Jürgen Habermas is one of the leading social theorists and philosophers of the post-Second World
War period in Germany, Europe, and the US, a prodigiously productive journalist, and a high-profile 
public intellectual who was at the forefront of the liberalization of German political culture. He is 
often labelled a second-generation Frankfurt School theorist, though his association with the 
Frankfurt School is only one of a rather complex set of allegiances and influences, and can be 
misconstrued. This entry will begin with a summary of Habermas’s background and early and 
transitional works, including his influential concept of the public sphere, before moving on to discuss 
in detail his three major philosophical projects: his social theory, discourse theory of morality (or 
“discourse ethics”), and discourse theory of law and democracy. It will then more briefly address 
Habermas’s methodology and philosophical framework (rational reconstruction and postmetaphysical 
thinking), his applied political theory, focusing on issues of national identity and international law, 
and finally his recent work on religion. 
1. Biography
1.1 Biographical Introduction 


Habermas was born in June 1929 and brought up in provincial North Rhine-Westphalia, to
conservative, educated middle-class parents, who had been neither critical nor strongly supportive of 
the Nazi regime. In 1944 he was called up to man the defences on the western front. A little over a 
year later he was shaken to his core by what he learnt of the Nazi atrocities from the Nuremberg 
Trials, and news coverage of the Holocaust. Thus, although still in his teens, he experienced 1945 as 
a turning point that would shape his political and cultural outlook. As he put it frankly in an interview 
in 1979: 


I am myself a product of “reeducation” ... By this I 
mean that ... we learnt that the bourgeois constitutional 
state in its French, or American, or English form is an 
historical achievement. (1992a: interview 3, 79) 


Two moments exemplify Habermas’s complex position in-between the generations of 1945 and 1968. 
In 1953, when Habermas was a student at Göttingen University, he published a critical essay in
the Frankfurter Allgemeine Zeitung concerning Heidegger’s remark about “the inner truth and
greatness of the Nazi movement” that Heidegger had written in his lectures on metaphysics in 1935,
and then failed to retract or alter in 1953 when republishing those lectures (1971c [1977]). In 1968,
at the height of the student protests in Germany, Habermas, who had been critical of the policing 
that had resulted in the killing of Benno Ohnesorg at a student demonstration the year before, 
directly criticized the students for acting out revolutionary fantasies, and for provoking the 
authorities into violence. He used the phrase “left-wing fascism”, a term he later admitted was too 
harsh (Müller-Doohm 2016: 141). Instead, he urged them to put the latitude granted to them by
liberal democratic institutions to work in the service of a “radical reformism” (Specter 2010: 111-115).


Habermas studied German philosophy and literature at Bonn, and wrote his doctoral dissertation on 
“The Absolute and History: the Ambivalence of Schelling’s Thought”. He came to Frankfurt in 1956, 
where he was Theodor Adorno’s Assistent at the Institute for Social Research for three years. In 1959 
he left for Marburg, having effectively been shouldered out by Max Horkheimer, who considered him 
a dangerous Marxist, and who tried to have him dismissed (Müller-Doohm 2016: 84-86; Habermas
1992a: interview 8, 218). In Marburg, he wrote his habilitation dissertation, The Structural 
Transformation of the Public Sphere, under Wolfgang Abendroth, one of the few Marxist academic 
philosophers in the post-war Federal Republic. Habermas, though often deemed a member of the 
Frankfurt School, was, in reality, at the institute for a very brief period. Whilst there, he recalls, 
“Critical Theory, at Frankfurt School—there was no such thing ... no coherent doctrine” (Habermas 
1992a: interview 4, 98). So it is misleading to say that he was or became a “member” of the Frankfurt 
School. In truth, he arrived there as an outsider, and while there, briefly, ploughed his own furrow. 


After a short period at the University of Heidelberg, Habermas returned to Frankfurt, where he
succeeded Horkheimer, with whom he soon reconciled, as Professor of Philosophy and Sociology. He
declined to become director of the institute. In Frankfurt, Habermas spent the latter half of the 1960s 
teaching in febrile and tumultuous political circumstances not conducive to research. In 1971 he 
became director of the Max Planck Institute for the Study of Living Conditions in the Scientific and 
Technical World in Starnberg, Bavaria, where he conducted the research which led to his magnum
opus, the two-volume Theory of Communicative Action. The year his magnum opus was published,
1981, Habermas resigned from the Max Planck Institute under unhappy circumstances, and again 
returned to Frankfurt. There he would remain, but for various visiting professorships in the US, until 
his retirement in 1994. Landmark publications during these years include many essays on moral 
philosophy, and Between Facts and Norms in 1992, Habermas’s major work in political and legal 
philosophy. Throughout his life Habermas has enthusiastically played the role of the public 
intellectual, taking part in disputes about positivism in the social sciences, the historical uniqueness 
of the Holocaust, German reunification, genetic engineering, and secularism and religion. He is the 
recipient of numerous honorary doctorates and prizes, including the Adorno Prize of the city of 
Frankfurt and the Kyoto Prize of the Inamori Foundation (Müller-Doohm 2016: 340).


1.2 The Public Sphere 


The public sphere is one of Habermas’s best-known concepts, introduced in his habilitation
thesis, published in 1962 as The Structural Transformation of the Public Sphere: An Inquiry into a 
Category of Bourgeois Society. Belonging to neither the state, the economy, nor the family, the public 
sphere is where private individuals come together to communicate about matters of general concern. 
It is the location of the public use of reason and the place where “public opinion” is
formed. Structural Transformation is a reconstructed history of the rise and fall of the public sphere
focused on Britain, France, and Germany from the early modern era to the mid-twentieth century. In 
the Middle Ages, there was a merely “representative” public sphere, in which kings and nobles 
displayed their status before society (1962 [1989: 7-10]). The bourgeois public sphere begins to 
emerge in the seventeenth and eighteenth centuries, at first in the guise of a literary public sphere. 
In coffee houses, salons, and literary societies, the new reading public came together to discuss 
novels; Habermas cites Samuel Richardson’s Pamela as an example (1962 [1989: 31-6, 49-50, 174]).
Skills of critical reasoning first developed in the journals of the literary public sphere were 
subsequently applied to the political public sphere, where public affairs rather than literary texts are 
the objects of criticism. In this period the modern state was emerging, as political authority was 
gradually depersonalized and vested in more-or-less independent bureaucratic institutions, rather 
than the person of the monarch (1962 [1989: 17-8]). Simultaneously, the development of mercantile 
capitalism endowed merchants with unprecedented wealth and influence, and an ever-greater need 
for accurate information about market conditions. This need was met by news-sheets and gazettes, 
which soon turned their attention to state policy as much as commodity prices (1962 [1989: 20-2]). 
The bourgeois public sphere thus developed concomitantly with both the capitalist economy and the 
Westphalian sovereign state, and flourished during the high point of bourgeois-liberal politics in the 
eighteenth and nineteenth centuries. 


The bourgeois public sphere is constituted by an ideological separation between public and private. 
The state and politics are deemed “public”, whereas civil society, the market economy, and the family 
are deemed “private”. The public sphere, according to Habermas, mediates between these two 
realms (1962 [1989: 30]). Participants in the bourgeois public sphere are private individuals, coming 
together to rationally and critically discuss public affairs, above all the actions of governments; it is 


a realm of private individuals assembled into a public 
body who as citizens transmit the needs of bourgeois 
society to the state, in order, ideally, to transform 
political into “rational” authority within the medium of 
this public sphere. (1964 [1974: 53]) 


Habermas would later refer to this as the generation of “communicative power”, which can 
legitimate the political system’s actions if yoked to the latter’s “administrative power” (1992b 
[1996b: 147-50]). As members of the public, private individuals bring decisions into the public sphere 
where they are open to rational discussion and criticism. In the process, participants form and 
articulate the general interest of society, drawing on ideas of truth, justice and human rights. 


Needless to say, participants in the bourgeois public sphere were de facto almost all educated male 
property-owning members of the bourgeoisie, along with some sympathisers from the aristocracy. 
Habermas has acknowledged the selective membership of the bourgeois public sphere during its 
heyday (1992c: 425-430), although critics have charged that he does not pay sufficient attention to 
the way it was constituted by excluding propertyless workers (Negt & Kluge 1972 [2016]) and, above 
all, women (Landes 1988). Despite its limitations, Habermas argues that the bourgeois public sphere 
nevertheless embodied certain principles and ideals, never fully realized, that are vital to any 
flourishing democratic society. It was thus both an “ideal” and an “ideology” (1962 [1989: 112]). 
Since differences in social status between interlocutors were bracketed as “private” matters (1962 
[1989: 36]) the public sphere was in theory universal, open to any literate person (1962 [1989: 37]). 
This ensured that rational argumentation was calibrated to universal standards of validity, not to the 
relative status of interlocutors, and could function as a cooperative search for truth and justice (1962 
[1989: 54]).


Habermas’s approach in this early work can be described as one of historical reconstruction in the 
service of internal criticism. He reconstructs an “ideal type” of the bourgeois public sphere, in order 
to criticize the really existing public spheres of modern democracies. 


The structural transformation which marks the beginning of the end of the bourgeois public sphere 
involves a re-definition of the public/private distinction. Under conditions of mid-twentieth century
“welfare-state mass democracy” (1962 [1989: 208]), state and society became ever more entangled
as governments pursued interventionist economic policies and expanded welfare provision. At the 
same time, non-state actors such as pressure groups, corporations, and political parties played an 
increasing role in governance (1962 [1989: 142]). Habermas refers to this process as 
“refeudalization” (1962 [1989: 200-1]). 


Habermas sees the modern public sphere as, in many ways, the victim of its own success. As it 
expanded far beyond its original basis of educated male property-owners, material inequalities could 
no longer be set aside, but rather became the subject of public debate (1962 [1989: 127]). And this 
debate was no longer a matter of rational-critical analysis of state action by the assembled public, but 
of negotiation between interest groups which bypass public reason. Instead of the approximation of 
society to the ideal type, what emerged was an impoverished pseudo-public sphere, lacking its 
original capacity for rational-critical discourse, easily manipulated by states, corporations, and 
interest groups using the techniques of “public relations” (1962 [1989: 176, 236]). Its role now, as in 
the feudal era, is to acclaim decisions which have already been made. 


Habermas continues to make use of the concept of the public sphere in his later works (1973a [1975: 
37-8, 48]; 1992b [1996b]), developing a detailed account of its place in modern societies (1992b 
[1996b: 359-87]; 2008b [2009: chapters 8 & 9]). In his original formulation, there was a tendency to 
assume the existence of a single unified public sphere for a single polity. In response to Nancy 
Fraser’s discussion of “subaltern counterpublics” (Fraser 1992) and acknowledging his own earlier 
neglect of “plebeian public spheres”, Habermas now concedes that there may be a multitude of
intersecting public spheres within a given society, focusing on different communities and topics but 
with porous boundaries that allow flows of communication to pass between them (1992c: 424-5; 
1992b [1996b: 373-4]). The closest thing to a “universal” public sphere is the political public sphere, 
focused on the political system. The political public sphere acts as a “sounding board” for problems 
which affect society as a whole, as well as a “filter-bed”, filtering out contributions to public 
discourse which represent generalizable interests (2008b chapter 11 [2009: chapter 9, 143]). In this 
manner, reflexive public opinion is formed and communicative power is generated. One of the most 
salient features of the contemporary political public is its division into “formal” and “informal” 
segments, the former denoting the highly regulated discourse of elected politicians, judges, 
parliaments, and courts, and the latter the “wild”, unregulated flows of communication outside these 
spaces (2008b chapter 11 [2009: chapter 9, 159-62]). Habermas argues that the right kind of 
feedback between formal and informal public spheres is vital for legitimating the political system’s 
actions. At the same time, there is no reason to think that public spheres must stop at national 
borders. In an increasingly interdependent world, global attention can be focused on single issues— 
Habermas mentions the wars in Vietnam and Iraq as examples—creating, at least temporarily, a 
transnational public sphere (2004: chapter 3 [2006c: 39-48]). The political legitimacy of 
transnational polities like the EU hinges on whether the public spheres of its member states can act 
together, functioning as a European public sphere (2008b chapter 9, chapter 11 [2009: chapter 6, 
87-8, chapter 9, 181-3]). 


1.3 Early Works (1964-71) 


Knowledge and Human Interests is a historical reconstruction of the prehistory of positivism and 
scientism. The history Habermas reconstructs is a decline and fall of self-reflection in Wissenschaft— 
science in the broadest sense. His particular interest is in Erkenntniskritik, namely the tradition of 
critical philosophy from Kant, German Idealism, and Hegel through to Marx, which, on his account, is 
gradually side-lined by ways of thinking that employ the methods of positivistic natural and social 
science. His thesis is: “That we disavow reflection is positivism” (1968b: 9 [1971a: vii]). 


Habermas’s analysis takes its cue from Lukács’ idea of reification—the idea that beings, forms of life, 
and social relations assume the appearance of nature (Lukács 1971: 83-223). Habermas’s analysis
reveals this illusory independence and naturalness to be historical, and in principle reversible. 
Horkheimer and Adorno’s thesis that: “All reification is a forgetting” also animates the analysis 
(Horkheimer & Adorno 1971 [2002: 191]). Positivism in the empirical and social sciences, and 
historicism in the cultural sciences, Habermas argues, exhibit a merely contemplative stance toward 
their respective objects, an attitude that obscures the point that science and knowledge is 
fundamentally a human enterprise that serves human interests, which are rooted in the natural 
history of the species (1968b [1971a: 301-4]). 


According to Habermas, there are three knowledge-constitutive interests. The empirical and natural 
sciences are governed by the cognitive interest in the technical control of objectified processes. The 
historical-hermeneutical sciences are shaped by a practical interest in orienting action and reaching 
understanding, while self-reflection (and Erkenntniskritik) are determined by a cognitive interest in 
emancipation and in Mündigkeit, that is, autonomy and responsibility (1968b [1971a: 313-314]).
Habermas’s overall aim is to explain how Marxism, and social theory more broadly, succumbed to a 
positivistic self-misconception, while rescuing the animus of Marx’s theory of society for critical 
social theory, by connecting it with the interest in emancipation and autonomy, and with a method of 
critical self-reflection. 


Habermas’s long essay “Technology and Science as ‘Ideology’” is a festschrift for Herbert Marcuse, 
and not an uncritical one (1968a [1970]). It deals with topics he broached in the volume of
essays Theory and Practice (1971 [1973b]) that broadly argue against the reduction of political
theory from a body of thought answering practical questions of how one should live, to “a form of 
social engineering that dispenses with public discourse” (Celikates & Jaeggi 2009 [2017: 261]). 


Habermas develops theses that exhibit the major concerns of his early work, based on
the following diagnosis: capitalist societies institutionalize the aim of economic growth, which has
the effect of expanding sub-systems of instrumental reason. In that context, science and technology 
cease to be connected to a realm of values that help people answer practical questions of how they 
want to live, and become absorbed instead into the economic and administrative systems as forces of 
production geared to the aim of continuous economic growth. Politics, instead of residing in a
popular practice of democratic self-determination, atrophies to the technocratic administration of 
public affairs in the hands of small groups of experts, leading to a depoliticization of ordinary citizens 
(Celikates & Jaeggi 2009 [2017: 261]). The question is who can reconnect technocratic politics with 
the “good life”. Habermas’s answer is that students may be part of the solution, since their protests 
do not aim at securing them “a larger share of social rewards”, but are targeted against the 
reduction of social and political life to economic growth and individual gain (1970: chapter 6, 120-122).


1.4 Transitional Works (1971-1982) 


Around the time of his move to Starnberg in 1971 Habermas initiated a complete overhaul of his 
theoretical framework for social theory, ultimately leading to the development of his mature 
communicative paradigm (McCarthy 1978). He also produced some seminal transitional works which 
dealt with the issues raised in the early work. Legitimation Crisis (1975), or to give the full original
title, Legitimationsprobleme im Spätkapitalismus (1973), is a sketch of an incipient research
programme that sets out from a critique of Marx’s social theory. On Habermas’s account, crises in capitalist
societies arise not from the immiseration of the working classes and class struggle, which have
largely been pacified by the welfare state and economic growth, but from legitimation deficits due to
alterations in the constellation of economics and politics. The problem posed by legitimation crises,
which are specific to late capitalism, is how the increasing intervention of the state in economic 
affairs can be made legitimate to those who are affected by these state interventions, and hold the 
state responsible for their effects. In conclusion, Habermas develops some hypotheses about how 
such crises can be resolved. The best-case scenario is that legitimation takes place through 
discursive justification, on the basis of norms embodying generalizable interests, and, failing that, 
through various kinds of compromise (1973a [1975: 112-114]). On this model normative structures 
and ideals of consensus become the key vantage point through which societies can be understood 
and criticized by the social theorist. 


This thesis is further elaborated in the collection of essays entitled Zur Rekonstruktion des
historischen Materialismus (1976a). In this volume Habermas challenges the Marxian assumption 
that developments in the sphere of social integration are determined by developments in the sphere 
of material production. By contrast he posits a logic of the development of normative structures, 
which represent the institutional analogues of the stages of cognitive development of moral 
consciousness in individuals as developed and tested by the cognitive moral psychologist Lawrence 
Kohlberg (1976a: 9-49 [1979: 95-130]). These normative structures represent a directional sequence 
of discrete stages that gain in complexity and comprehensiveness as they develop, and make 
collective learning possible. They allow different kinds of reasons to count as legitimations for the 
relevant kinds of social structures and levels of social integration, and serve as normative standpoints 
in the light of which societies can be understood and criticized. Crucially, they exhibit a logic of 
development altogether independent from that of the forces of production. 


At the time, Marxists criticized Habermas’s work both for abandoning central tenets of Marxism and 
for conceding too much to Niklas Luhmann’s systems theory (Ebbighausen 1976). That said,
Habermas denies Luhmann’s central claim that legitimacy is nothing but the motiveless acquiescence 
of citizens to the binding decisions of an administrative mechanism (Luhmann 1969). As for 
Habermas’s Marxism, the very focus of his transitional work on the system crises of capitalist society 
betrays not only the theoretical aspiration to understand capitalism, but also the practical aspiration 
to overcome it, or at least to understand how it could be overcome. That said, since the theory of 
legitimation crisis, as McCarthy noted, is not addressed to any agent of social transformation, it must 
ultimately remain content with diagnosing crisis tendencies (McCarthy 1978). 


2. Habermas’s Mature Social Theory: The Theory of Communicative Action


The idea of reason, which is differentiated in the 
various claims to validity, is necessarily built into the 
way in which the species of talking animals reproduces 
itself. (2001a: chapter 5, 85) 


Habermas’s mature work begins with The Theory of Communicative Action (1981 [1984a/1987]), the 
fruits of a long and difficult decade he spent at the Max Planck Institute in Starnberg (Müller-Doohm
2016: 214). It is an ambitious and wide-ranging work, which provides a general framework within 
which several related research programs are arranged. It comprises: 


1. A sketch of a unified theory of meaning and action.
2. A typological theory of social action.
3. A social ontology.
4. An outline of a critical social theory tied to (1)-(3).


This section will discuss these topics in turn. 


2.1 Habermas’s Pragmatic Theory of Meaning


While the theory of communicative action is designed to answer questions in social theory, Habermas 
also considers it as a contribution to the theory of meaning (1984b: 604). There are three pillars to 
this theory: 


1. The first comes from Karl Bühler’s Organon Model of language according to which language
is triadic, with three functions corresponding respectively to the objective world, the hearer, 
and the speaker (or the third, second and first person): a cognitive function, an appeal 
function, and an expressive function. 


2. The second pillar is speech-act theory, and in particular the idea of illocutionary force or 
meaning, which was developed by J. L. Austin and John Searle. 


3. The third pillar is “formal semantics”, the truth-conditional theory of meaning, and in 
particular Michael Dummett’s “verificationist” critique of it. 


These three pillars form the basis of what Habermas calls “formal pragmatics” or the pragmatic 
theory of meaning, the basic idea of which is that “We understand a speech-act when we know what 
makes it acceptable” (1984a: 297, hereafter TCAI): to understand what a speaker means the hearer
has to have access to the reasons for the speaker’s utterance. 


The first pillar, Bühler’s functional schema of language, is important as a guiding assumption of
Habermas’s theory. Habermas sets so much store by Bühler’s model because it encompasses the
entire field of linguistic meaning, and gives equal weight and priority to its three dimensions: what is 
intended by the speaker, what is said in the content of the utterance, and what is done with that 
utterance. All three dimensions are present in what Habermas considers as the original mode of 
communication whereby a speaker, S, reaches understanding with another person, H, about 
something (1988b [1998b chapter 6: 279] [1992a: 58]). The triadic architectonic radiates into all 
aspects of Habermas’s theory: the thesis that there are three validity claims: to truth, rightness and 
sincerity (TCAI: 307); that speakers can adopt three attitudes: an objectivating, a norm-conformative, 
and an expressive attitude (TCAI: 309); that speakers through their utterances take up relations to 
three “worlds”: the objective world of states of affairs, the social or intersubjective world of 
legitimate social orders, and the subjective internal world (TCAI: 49-52, 60, 236, 308); and finally 
that there are three basic modes of speech which form the basis of the classification of speech-acts:
constatives or assertoric speech-acts, regulative speech-acts (such as imperatives or requests), and
expressive speech-acts (TCAI: 309). These triadic distinctions nest within one another.


To the extent that there is an argument in Theory of Communicative Action for the triadic structure 
itself, it rests on the basic claim that there are three equiprimordial, meaning-critical validity claims. 
Every speech-act simultaneously makes a claim to truth, to rightness, and to truthfulness. That 
means a speech-act can be taken up by the hearer, and assessed in the light of its propositional truth, 
normative rightness, or the sincerity with which it is expressed. If accepted, this means that 
agreement (Einverständnis) is reached “simultaneously on three levels” (1981 [1984a: 307]). In 
defence of this view Habermas argues that a speech-act can always be rejected from three 
perspectives: in the light of its assertibility conditions, its normative justification, or the sincerity of 
the speaker (1981 [1984a: 306]; 1988b essay 4 [1998b: 231]; 1988b essay 6 [1998b: 296]; 1999a 
[1998b: 317]). However, the claim that needs to be defended, not assumed, is that the validity claim 
for every speech-act can be rejected from three and only three perspectives. And as Dorschel claims, 
an assertion or utterance might be rejected in virtue of the volume or style with which it is uttered 
(Dorschel 1988: 8-9). In the final analysis, the “argument” that any utterance can be rejected from 
three and only three perspectives is question-begging. So the triadic structure, stemming from
Bühler’s schema, remains best thought of as a hinge assumption.


The second pillar is speech-act theory. Because speech-act theory construes speech as action, or, to 
use Austin’s phrase, as “doing things” with words, it is well suited to provide the basis for a unified 
theory of meaning and action. According to the theory, a speech-act has both propositional content, p,
and illocutionary force, M. So an utterance Mp, for example “the ice is thin”, can be both a
statement about the way the world is and, depending on the context, a warning.


That said, Habermas’s main focus is on illocutionary acts, the aims of which, in contrast to 
perlocutionary acts, he contends, can always be made manifest (1986b [1998b chapter 3: 202]).
When a speaker makes a declaration or a promise they thereby signal to the hearer what they are 
doing. The key to Habermas’s use of the term “illocutionary” is that he identifies and specifies a 
putatively universal internal mechanism by which speakers realize their various illocutionary aims: 
they make validity claims for their utterance in order to reach understanding
(Verständigung or Einverständnis). Speakers do this by making an implicit guarantee that they can, if
necessary, adduce good reasons for their utterance. Hearers, for their part, are always free to 
respond with a “yes” or “no” (1981 [1984a: 302]) to this validity claim. When they respond with a 
“yes”, speaker and hearer reach understanding or agreement. “Reaching understanding is the 
inherent telos of human speech” (1981 [1984a: 280]). 


The third pillar is formal semantics. To explain the notion of a meaning-critical validity claim, 
Habermas enrols Dummett’s verificationist critique of truth-conditional semantics, and the epistemic 
turn in formal semantics (1981 [1984a: 316-8]; 1981 [1998b chapter 2: 153]). Dummett argues that 
justification is an epistemic idea, but truth is not, and that the truth-conditions of many sentences are 
unknowable, even where their justification conditions are not. He proposes the view that we 
understand the meaning of a sentence when we know the conditions under which it is assertible, 
rather than the conditions under which it is true (Dummett 1993: 45; Heath 2001: 120-121; Fultner 
2011a: 60-62). 


Habermas takes Dummett’s thought and extends it to natural languages, and to the pragmatics of 
meaning and understanding. This is why he calls his approach “formal pragmatics”, rather than 
“formal semantics”. 


[I]t is possible to generalize Dummett’s explanation. We 
understand a speech-act when we know the kinds of 
reasons that a speaker could provide ... claim validity 
for his utterance—in short when we know what makes 
it acceptable. (1986b [1998b chapter 3: 232]) 


But as Heath points out this is problematic. For there is a semantic dimension to, and motivation for, 
Dummett’s idea that to know the meaning of a sentence is to know the conditions under which it is 
assertible. It offers a unified explanation of the compositional structure of language, namely of how 
one can construct an infinite number of meaningful sentences out of a finite number of semantic units 
and the rules for their composition. In turn that explains how we can understand the meaning of a 
sentence we have never encountered before. This may work for assertions, but it is inapplicable to 
the pragmatic dimension of meaning, to the illocutionary force of utterances, which lacks a 
compositional structure. So it is also unclear how it would work for the other kinds of speech-acts 
such as regulatives and expressives. 


This strongly suggests, as Heath argues, that there may after all be only one validity claim that is 
“meaning-critical”, or “internal” in the sense that it is constitutive of the meaning of utterances, 
namely the validity claim to the truth of assertions or of the propositional components of other kinds 
of speech-act (Heath 2001: 115-6). It is potentially damaging for Habermas also for another reason. 
It shows that to understand an utterance that makes a rightness claim one need not know how the 
utterance or claim is justified. Understanding an utterance need not involve accepting reasons for 
action. Recall that Habermas insists that the illocutionary aim of the speaker is not only to
be understood, in the sense that addressees recognize the sense of her utterance, but also to be agreed with,
in the sense that they also accept the relevant reasons that the speaker could adduce in support of 
their utterance. In the case of utterances that make rightness claims these are reasons to act or 
behave in certain ways. Habermas must establish the latter, because his whole theory depends on the 
claim that the normative commitments unavoidably generated in speech reach over into the 
subsequent action sequence (1981 [1984a: 302-3]). 


Habermas later makes a move that appears to address this problem without solving it: he claims 
truth to be paradigmatic of validity, and rightness to be merely analogous with truth (1999a [2003a: 
229]). However, he does not say what the analogues are, nor does he explain what the basis for the 
analogy is (Finlayson 2005). To claim that validity claims to rightness are analogous to validity
claims to truth, in that they determine the meaning of the utterances that make them, and can play 
the role as a core component of a theory of meaning, is to beg the important questions. 


2.2 A Theory of Social Action 


The pragmatic theory of meaning is intended to provide the theoretical framework for Habermas’s 
social theory, which consists of a typological theory of human action. Habermas offers a number of 
typologies. He distinguishes four different “models” of action: teleological action, subdivided into 
instrumental and strategic action; normatively regulated action; dramaturgical action; and 
communicative action (1981 [1984a: 85ff]). He later offers the following more rudimentary typology 
which divides action on the horizontal axis into success-oriented and consensus-oriented action, and 
on the vertical axis into non-social (individual) and social action (1981 [1984a: 285]).


                        Action Orientation
Action Situation        Success                 Consensus
Non-Social              Instrumental Action     (none)
Social                  Strategic Action        Communicative Action


At the heart of this typology is the distinction between communicative action (rationality) on the one 
hand and instrumental and strategic action (rationality) on the other. 


Habermas defines communicative action in various ways, but always in terms of agreement on the
basis of validity claims. In one place, he claims that it comprises “linguistically mediated actions in
which all participants pursue illocutionary aims and only illocutionary aims” (1981 [1984a: 295]).
That, however, is to put the point too strongly since agents don’t only aim to understand and to be 
understood (Steinhoff 2009: 35-6). A better definition is that communicative action is action in which 
participants pursue illocutionary aims “without reservation” also when they pursue “perlocutionary 
goals” via “illocutionary goals already achieved” (1986b [1998b chapter 3: 241]). This is important 
because Habermas correlates instrumental action with perlocutionary aims and communicative 
action with illocutionary aims, and he maintains that instrumental and strategic action depends on 
communicative action, but not vice versa. The idea is that agents, through their utterances, make 
validity claims (to truth, rightness and truthfulness) on which the meaning of their utterances 
depends, which are then taken up by their interlocutors and form the basis of understanding and any 
subsequent interactions. 


In response to the basic worry that he conflates reaching understanding of an utterance with 
reaching agreement on a norm of action, Habermas introduces a distinction between weak and 
strong communicative action. He tends to make the distinction in terms of the difference between 
utterances for which speakers only make validity claims to truth, and those for which they make 
validity claims to rightness and hence offer normative reasons that “bind their wills” (1999a [1998b 
chapter 7: 327]). In weak communicative action the reasons that determine meaning are also 
supposed to guide action by providing information about the way the world is: in strong 
communicative action the reasons that are supposed to determine meaning are action guiding 
because they are practical reasons based on shared intersubjective norms (1999a [1998b chapter 7: 
326-7]). 


Habermas contrasts communicative action with instrumental action, which he takes to be action 
whereby agents select the best or only means in order to achieve the agents’ ends (1981 [1984a: 
285]). He construes strategic action as a social variant of instrumental action, whereby agents seek 
to influence other agents, in order to achieve their ends, often (though not always) via the medium of 
language (1981 [1984a: 285]). Habermas assumes that all action is rational. But different action 
types involve different kinds of rationality—instrumental and communicative. He also allows that all 
action is broadly teleological, but claims that while instrumental and strategic actions are oriented 
towards success, communicative action is oriented towards consensus or reaching 
understanding/agreement. 


Habermas’s conception of instrumental and strategic action has come in for much criticism. He tends 
to assume that all instrumental and strategic action is egocentric and monological action, in which 
agents aim to achieve their desired ends, and that they pursue these ends individually, not in concert 
with others. These assumptions are brought out when he contrasts instrumental and strategic action 
with communicative action whereby 

the actions of the agents involved are coordinated not
through egocentric calculations of success but through
acts of reaching understanding. (1981 [1984a: 285-6,
288]; 1981 [1998b chapter 2: 118]; Celikates & Jaeggi
2009 [2017: 263-4])


He also assumes that strategically acting agents achieve their ends through influencing others. He
allows that strategic actors can cooperate, but claims that strategic cooperation is always
subordinated to the satisfaction of the agents’ own individual (egocentric) ends.


Success in action is also dependent on other actors, 
each of whom is oriented to his own success and 
behaves cooperatively only to the degree that this fits 
with his egocentric calculus of utility. (1981 [1984a: 
87-88])


Habermas also assumes that in acting strategically an agent adopts an objectivating attitude towards
others. Strategic actors coordinate their actions with others by trying to influence them or
manipulate them.


However, many critics have pointed out that it is not the case that agents who act instrumentally or
strategically must act selfishly, or monologically, such that they are incapable of stable forms of
cooperation (Johnson 1991; Heath 2001; Steinhoff 2009; Blau 2022). These assumptions appear to be
imported from the “traditional” action theories as Habermas understands them. Not only do these 
assumptions require independent justification, but they are also incidental to the basic distinction 
Habermas is attempting to draw. 


2.2.1 The Unavoidability Thesis 


Having drawn the distinction between communicative and instrumental action, Habermas argues for 
two basic theses. Recall the starting intuition of Theory of Communicative Action: 


The idea of reason, which is differentiated in the 
various claims to validity, is necessarily built into the 
way in which the species of talking animals reproduces 
itself. (1984b [2001d: 85]) 


Habermas claims that in societies like ours, there is no functional equivalent for language as the 
medium of action-coordination and social integration. And if his reconstruction of language use is
correct, this takes place through communicative action:


the symbolic structures of the life-world can be 
reproduced only through the medium of action 
orientated to understanding. (1982: 237) 


If Habermas’s pragmatic theory of meaning is correct, then a weak transcendental necessity 
transmits from the premises, that communication and discourse are necessary to social reproduction, 
to features of his reconstruction thereof, such as validity claims, rules of argumentation, etc. The 
transcendental necessity in question is weak since the premises are contingent and empirical: they 
are not themselves logically or even physically necessary. And the reconstruction is itself defeasible. 
Nonetheless, the reconstructed features, Habermas contends, are socially necessary for agents like 
us—roughly agents of modern societies. Language exists, and language use is not optional for human 
beings. Linguistic practice presupposes the structures and rules of communication and discourse that 
alone enable communication. So there is no feasible alternative to speaking, making one’s utterances 
understood to interlocutors, and so of raising validity claims, and also, as we will see below, invoking 
the rules of discourse (Heath 2001: 295-98). Validity claims, and the rules of discourse that govern 
the practice of argumentation, are what Habermas calls “pragmatic preconditions” of 
communication. This means that 


from the performative perspective of the participants in 
interaction, these presuppositions must be undertaken. 
(Fultner 2019; Habermas 1999a [2003a: 85-86; 17-18]) 


This is the “universality” expressed in the label “universal pragmatics” which Habermas originally 
gave to his research programme (1976b [1979: 1-68]). 


2.2.2 The Irreducibility Thesis 


Habermas’s theory is that communicative action is the “basic form of action” and that instrumental 
and strategic forms of action are derived from, and parasitic upon, action oriented toward reaching 
understanding (1981 [1984a: 228]; 1999a [2003a: 86ff]; 1976b [1979: 1]). This is a bold and 
controversial claim, which overturns the central tenet of “traditional” action theory, namely that the basic
form of rationality is means-ends rationality, and that the basic form of action is instrumental, and
success-oriented. 


Habermas’s argument for the irreducibility thesis rests on his speech-act theory. He claims that 
perlocutionary meaning (roughly, the intended or unintended ends that agents achieve through the 
use of language) depends essentially on illocutionary meaning, namely on the reason-based 
consensus arising from the offer and acceptance of validity claims, but not vice versa. The latter—the 
illocutionary—is the “original mode of language use” upon which the instrumental use of language is
“parasitic” (1981 [1984a: 288]). In latently strategic uses of language, perlocutionary effects are 
achieved only through the unreserved pursuit of illocutionary aims. In other words, strategically 
acting individuals use language normally, giving their interlocutors to believe mistakenly they are 
aiming at reaching understanding, when they are not (1981 [1998b chapter 2: 118]). 


One problem with this argument is that Habermas recalibrates Austin’s terms “perlocutionary” and 
“illocutionary” so that they are inextricably bound up with instrumental/strategic and communicative 
action respectively from the start. So his argument virtually presupposes what it is supposed to 
explain. Another difficulty is that it is unclear whether or not the claim is that the normal mode of
language use involves the illocutionary goals of reaching agreement on shared norms and binding
practical commitments, as in “strong” communicative action.


For these and other reasons most commentators agree that Habermas fails to establish the 
irreducibility thesis by argument (Baurmann 1985; Steinhoff 2009; Blau 2022). Steinhoff claims
that all Habermas’s arguments for the claim that communicative rationality is irreducible to
instrumental rationality fail, but that the reverse is true (Steinhoff 2009: 46) and the traditional 
theory of action is correct. Others think that the basic distinction is useful and can be justified by 
argument, even if Habermas’s own argument is not clinching (Blau 2022; Heath 2001). Heath 
provides an argument in the other direction which shows, contra Steinhoff, that instrumental and 
strategic action cannot account for language use, and argues that this result can support Habermas’s 
contention that instrumental and strategic action presuppose communicative action but not vice
versa (Heath 2001: 45-48).


2.3 Habermas’s Social Ontology 


The third research program, Habermas’s social ontology, rests on the previous two. 


This takes the form of a dyadic analytic distinction between “system” and “lifeworld”, which form the
ontological counterparts or “complements” of instrumental and strategic action, and communicative
action respectively (1981 [1987: 119]). Societies should be conceived “simultaneously as systems and 
lifeworlds” (1981 [1987: 118]). 


They are theoretical concepts, which enable social theorists to explain social order. 


They are also ontological concepts that respectively denote two different kinds of existing social order,
or to put it another way two distinct but complementary mechanisms of social integration. In the 
former case, systems stabilize the 


non-intended interconnections of action ... by a
non-normative regulation of individual decisions that extends
beyond the actor’s consciousness.


In the latter, the lifeworld, social integration is brought about by means of “a normatively secured or 
communicatively achieved consensus” (1981 [1987: 117]). 


2.3.1 Lifeworld 


On the one hand, Habermas presents the lifeworld phenomenologically as “a background stock of 
cultural knowledge that is ‘always already’ familiar to agents” and thus makes mutual understanding 
possible. In this it has what he calls a “peculiar half transcendence” that, unlike the formal notions of 
the subjective, objective, and intersubjective worlds, cannot be objectified and brought before 
consciousness (1981 [1987: 154-4]). On the other hand, he associates it with specific domains of 
social life—such as family life, everyday life, and civil society—in which communicative actions 
predominate, and agents coordinate their interactions by means of speech-acts and their underlying
validity claims (Baxter 2011: 166; Heath 2011: 75). Either way, the lifeworld enjoys a certain primacy 
over the system, since it “remains the subsystem that defines the pattern of the social system as a 
whole” and because systems “need to be anchored in the lifeworld” (1981 [1987: 154]).


2.3.2 System 


Habermas developed his notion of the system in his writings prior to Theory of Communicative 
Action, particularly through his engagement with Luhmann (1973a [1975: 1-8]). Systems are macro- 
level processes that stabilize complexes of actions via steering mechanisms. The two main examples 
are the economy and bureaucracy, which function respectively via the steering mechanisms of money 
and power. Unlike the lifeworld, systems fulfil their functions through “a non-normative regulation of 
individual decisions that extends beyond the actor’s consciousnesses” (1981 [1987: 117]). This is a 
reference to an “invisible hand” mechanism of the kind that occurs in Mandeville, Smith, and Hegel, 
with the important difference that in these latter theorists the invisible hand, providence-like, serves 
the common good. For Habermas the function of the systems of the economy and bureaucracy is 
merely to harmonize and stabilize complexes of individual actions, and thereby to bring about 
societal integration and reproduction. They act as relief mechanisms that ease the burden on 
communicatively acting subjects. Of course, in functioning societies the economy and
bureaucracies should also serve the common good. According to Habermas’s theory this only
happens when the system bears the right kind of relation to the lifeworld. 


When Habermas first introduced the idea of the system he was engaging with Luhmann’s work and 
thinking mainly of cybernetic systems (1976a [1979: 170]; 1973a [1975: 130-142]). A good example 
of a cybernetic system is a heater linked to a thermostat, where each serves as input and output. If 
the heat rises beyond a fixed temperature, say 20 degrees Celsius, the thermostat switches the heater off, and
if it drops below that temperature the thermostat switches the heater on again (Heath 2011: 83-84). 
The result (or goal-state to use the slightly misleading technical term) is to keep the room at an even 
ambient temperature. 


Habermas construes systems, in the light of Luhmann, as spheres of “norm free” sociality. “In 
capitalist societies the market is the most important example of a norm-free regulation of cooperative 
contexts” (1981 [1987: 150, 154]). This puts him on the side of those who see markets as destroying 
rather than nourishing the web of moral relations. His predominant way of thinking about 
subsystems is that they are “demoralized”. This is true of the market economy, bureaucracies and 
state administration, and the law, which in Theory of Communicative Action he treats as a subsystem. 
That said, even in this text, while he claims that law in the process of rationalization becomes 
“detached from the ethical motivations of the legal person” (1981 [1987: 174]), he nonetheless thinks 
of basic rights and the principle of popular sovereignty as sources of legitimation which act as 


bridge between a de-moralized and externalized legal
sphere and a deinstitutionalized and internalized
morality. (1981 [1987: 178])


Systems, which facilitate integration and social order through “delinguistified steering media” like
money and power, have great advantages for citizens of modern societies. They fulfil functions that
are too complex or burdensome to be undertaken by communicative action, that is, by individuals
acting consciously in concert. For example, markets distribute goods and resources to where they are
most needed, using price signals and laws of supply and demand. 


However, systems also have disadvantages. For one thing, systems, once in place, operate 
independently of human agents. There is, consequently, a gap between an actor’s agency, and their 
conscious intentions and aims, and the purpose that they serve in the system. This lack of 
transparency is evident in firms, for instance, where the agents fulfil their roles and tasks, whether 
using instrumental, communicative, or moral rationality, or a mix of all, while all the time behind 
their backs or “beyond their consciousness” at the macro-level they are making profit for the firm’s 
owners and shareholders. For another, Habermas claims, agents operating in spheres steered by 
delinguistified media are inclined to shift from communicative to instrumental and strategic action 
orientations with the result that 


success-oriented action steered by egocentric 
calculations of utility loses its connection to action 
oriented by mutual understanding. (1981 [1987: 196]) 


Whether Habermas holds that agents’ actions in economic and bureaucratic domains are merely 
constrained by system imperatives of the relevant steering media, or reduced to instrumental and 
strategic actions is moot (Jütten 2013). But it is empirically false, for reasons given by Honneth,
Joas, and others, that the mediatization of a domain of social life would force agents to adopt only one
type of action. As Joas puts it, every sphere of action contains “a wealth of different types of action” 
(Joas 1986 [1991: 104]). Systems, economic and bureaucratic, and the specific organisations they 
comprise, all involve numerous different kinds of action. This is not only an empirical claim but a 
conceptual one that follows from Habermas’s own theory that instrumental and strategic action is 
parasitic on communicative action. Habermas has also been criticized for being seduced by systems 
theory into merely accepting spheres of norm free sociality, and the uncoupling of system and 
lifeworld as a normal result of modernization and social differentiation (McCarthy 1991). 


2.3.3 The Relation of Lifeworld to System 


The relation of lifeworld to system is pivotal to Habermas’s social theory. Most commentators (for 
example McCarthy 1991: 154 and Baxter 2011: 166) take this to be the main problem facing the 
theory. Actually, as Habermas notes, there are two related problems: the problem of constructing a 
theory that can combine systems theory and action theory fruitfully (1981 [1987: 201]), and the 
problem of articulating the actual relation between system and lifeworld. 


So what is that relation? Habermas accords primacy to the lifeworld, on the grounds that it
“defines the pattern of the social system as a whole”. Systemic mechanisms, he contends, “need to be 
anchored in the lifeworld” (1981 [1987: 154]). It is tempting to think that Habermas’s thesis of the 
primacy of the lifeworld over the system must have to do with the primacy of communicative over 
instrumental and strategic action, the view that instrumental and strategic action are “parasitic” on 
communicative action, but not vice versa. However, that would be problematic since Habermas does 
not succeed in establishing by means of speech-act theory that communicative action is the basic 
form of action on which instrumental and strategic action essentially depend. 


Even if Habermas had an argument that conclusively demonstrated the primacy of communicative 
over instrumental action, that would not suffice to establish the primacy of the lifeworld over the 
system. The primacy of one kind of action over another could not itself establish the primacy of one 
kind of social order over another. That would be a conflation of levels. Habermas does in fact quite 
often conflate types of action with spheres of action, as we see from his claim cited above that 
success-oriented individual actions are “steered” by egocentric calculations (Joas 1986 [1991: 104]). 
For example, he talks of “subsystems of purposive rational action” and “normative steering” (1981 
[1987: 180-1]). 


2.4 Habermas’s Critical Social Theory 


Habermas’s thesis of the primacy of the lifeworld over the system gives him a framework in which to 
criticize neo-liberalism and its mania for the marketization and financialization of everyday life: “the 
whole program of subjecting the lifeworld to the imperatives of the market must be subject to 
scrutiny”, whether in health care, public transport, military security, or secondary and tertiary 
education (2009: 186). He does this with his theory of the colonization of the lifeworld by the system. 


2.4.1 Colonization 


Recall that mediatization—the intrusion of steering media into social domains that were hitherto not 
systematically organized and integrated—is not, according to Habermas, itself a pathological 
development. It has a positive side, namely as a relief mechanism, a way of coping efficiently with 
complexity. But colonization, as the label suggests, is bad. Colonization occurs where 


systemic mechanisms suppress forms of social 
integration even in those areas where a consensus- 
dependent coordination of action cannot be replaced, 
that is, where the symbolic reproduction of the 
lifeworld is at stake. (1981 [1987: 196]) 


Symbolic reproduction is where society is maintained through communicative actions. As a 
consequence of colonization, the lifeworld shrinks, and its capacity for symbolic reproduction 
atrophies (1981 [1987: 154, 173]). This is functionally bad. To the extent that the system, and society 
as a whole, depend on the lifeworld and its communicative resources, colonization is self-stultifying 
and destabilizing. 


Habermas follows Marx and Weber on this, rather than Luhmann, insofar as he shows that 


the rationalization of the lifeworld makes possible the
emergence and growth of subsystems whose 
independent imperatives turn back destructively upon 
the lifeworld. (1981 [1987: 186]) 


This not only eradicates normative contexts of action but, he maintains, supplants these with 
instrumental and strategic action orientations, which he calls a “pathological de-formation of the 
communicative infrastructure of the lifeworld” (1981 [1987: 375, 180, 181]). This recalls Adorno’s 
discussion of bourgeois coldness, but broadens and extends it and decouples it specifically from the 
moral abomination of Auschwitz (Adorno 1966 [1973: 363]). For example, consider that once 
healthcare and housing are privatized and seen as mere commodities, landlords can evict tenants 
because of rent arrears, and hospitals can turn away patients who cannot afford to pay, with no moral 
qualms or social responsibility, because they see these actions as merely economic transactions. 
Colonization is also bad, according to Habermas, insofar as it leads to a variety of social pathologies 
pertaining to culture, society, and person, including loss of freedom, loss of meaning, crises in 
legitimation, anomie, and alienation (1981 [1987: 385; 142-3]). 


2.4.2 Reification 


Habermas presents his colonization thesis as a reformulation of the Lukácsian idea of reification that
was so influential on the first generation of Frankfurt School critical theorists, especially Adorno 
(1981 [1984a: 399]; 1981 [1987: 1]). Lukács argues that the commodification of all domains of 
modern society, and consciousness thereof, has led to its “taking on the character of a thing”, hence 
the economy and bureaucracy and the legal system appear to subjects as a “second nature” (Lukács
1971: 83, 86). Consequently they adopt a passive attitude towards it, in theory as contemplation, in 
practice as adaptation. This results not only in an illusion, but also in a kind of inaction, since people 
see and treat things as natural, and so not up to them, and not alterable, when such things are in fact 
historical and in principle reversible, and up to them. That idea needs reformulating, Habermas 
maintains, because it mistakenly conflates rationalization and social differentiation with reification. 
Lukács, for example, regards rationalization, understood as the destruction of social totality (namely the
substantial ethical life of a community), as an unqualified bad, whereas Habermas sees it as having
upsides and downsides.


Reification is the major downside, which Habermas sees as a specific alienating effect of the 
destruction of the communicative capacity for symbolic reproduction and social integration provided 
by the lifeworld. In particular, reification arises when the intrusion of steering media causes the 
“conversion to another mechanism of action coordination” (1981 [1987: 375]). Since this conversion 
happens as it were behind the backs of social agents rather than through their conscious intentions 
and choices, colonization gives rise to “objectively false consciousness” and its effects “have to 
remain hidden” (1981 [1987: 187]). Notably, contra the Marxist tradition up to Lukács, the reification
effect arises in class-unspecific ways and thus the crises do not precipitate class conflict, which, 
Habermas argues, the welfare state has largely succeeded in pacifying (1981 [1987: 187]). 
Furthermore, though reification effects are “filtered through the pattern of social inequality”, the 
latter is not one of the pathologies he focuses on (1981 [1987: 349]). He focuses more on welfare 
state clientism, and juridification (1981 [1987: 357, 363-4]). This is one of the important differences 
between Habermas’s social theory and Rawls’s Theory of Justice, and leads to a completely different 
outlook in the diagnosis of social ills (Jütten 2011). 


2.4.3 Normative Grounds 


This brings us to the third significant feature of Habermas’s critical social theory: the problem of 
normative grounds. In the introduction to Theory of Communicative Action Habermas describes his 
project as “not a meta-theory but the beginning of a social theory concerned to validate its own 
critical standards” (Schnädelbach 1986 [1991: 8]), inviting the contrast with the critical “social
theory” developed in the mid-twentieth century by Horkheimer and Adorno, which, according to
Habermas’s criticism, had foundered on “the difficulty of giving an account of its own normative 
foundations” (1981 [1984a: 374]). He also criticizes systems theory for failing to see the primacy and 
fragility of the lifeworld, its function of embedding and facilitating all spheres of action, and its 
crucial role in producing and reproducing social order. Systems theory conflates system and social 
integration and “deprives itself of the standard of communicative rationality” (1981 [1987: 186]). 


This has led commentators to assume that the ideas of communicative action together with the 
theory that in modern society socialization takes place through communicative action provide 
Habermas with an account of normative foundations, of the kind he claimed was missing from earlier 
Frankfurt School theory. The trouble is that it is unclear to what extent this is a functionalist theory, 
supported by mainly empirical claims, or a normative one; and, if normative, in what sense. As 
Herbert Schnädelbach was among the first to point out, Habermas’s approach of rationally
reconstructing the communicative infrastructure of modern societies does not obviously fit the bill
here. Functionalist explanation is not normative justification (Schnädelbach 1986 [1991: 21]). This
also applies to his reconstructed theory of reification (Jütten 2011). It is clear that Habermas’s is a
normative theory, to the extent that if his diagnosis is correct, the “damage” done by colonization is
not only social and structural: individuals too are harmed and suffer as a result of it, and their
legitimate expectations are disappointed. This is the case whether or not they are “wronged”. The 
problem can be put as a dilemma. Either Habermas’s social theory is genuinely critical, in which case 
it judges that a colonized lifeworld is bad and ought to be changed (and such judgement requires 
substantive normative premises), or it stops short of such a judgement, and is hence not properly 
critical. 


Maeve Cooke argues that Habermas’s theory contains implicitly a “utopian promise” of a rationalized 
lifeworld, that is “reproduced through processes of intersubjective evaluation of validity claims” 
(Cooke 1994: 162), so that substantial ethical reasons ground its normative claims. Other 
commentators, such as Honneth, assume that discourse ethics provides the account of its normative 
foundations that, according to Habermas, earlier critical theory stood in need of (Honneth 1985 
[1991: 282]). However, there are numerous problems with this view. One is that the normative claims 
of critical theory stand in need of substantive moral reasons, rather than accounts of those reasons
(Finlayson 2013a), and as Schnädelbach points out it is not clear that Habermas’s rational
reconstruction of communication as discourse can provide such reasons. 


3. Discourse Ethics 


Discourse ethics was developed contemporaneously with the theory of communicative action and fits 
into roughly the same framework. Discourse ethics comprises a number of different interlocking 
theories: 


the lineaments of a social theory of morality
a normative moral theory
a philosophical justification of the moral standpoint (the principle of universalization)
a theory of the development of moral consciousness


Unlike the Theory of Communicative Action, there is no single work in which Habermas’s discourse
ethics is given a settled and canonical statement. Instead, it is an evolving research programme,
presented in a series of essays most of which are contained in two volumes published in English
as Moral Consciousness and Communicative Action (1990a) and Justification and Application (1993).
The development of Habermas’s discourse ethics falls into two phases, roughly 1983-1990 and
1991-1996. Broadly speaking, the essays in Moral Consciousness and Communicative Action comprise
phase one, and those from Justification and Application through to Between Facts and Norms, phase 
two. We will call the theory developed in phase one “discourse ethics”, and that in phase two, 
“discourse morality”. For at the beginning of the 1990s Habermas introduces a distinction between 
morality as a normative theory of rightness, and ethics as a theory of the good. In light of this 
distinction, he himself later notes, he should strictly speaking have renamed his theory “the 
discourse theory of morality” (1993: 1; 1991: 7). 


3.1 Discourse 


Discourse is not a synonym for language or speech. Only speech that is explicitly oriented towards 
reaching rationally motivated consensus counts as discourse (1981 [1984a: 42]). In other words, 
discourse is communication, for Habermas stipulates that communication just is speech oriented 
towards reaching understanding. But discourse is communication of a special kind. When 
communication breaks down and the everyday shared meanings and understandings that normally 
coordinate interactions are disrupted, interlocutors have to switch from action to discourse. 
Discourses are a “reflective form of agreement-oriented action that ...sit on top of the latter” (1986a 
[1990b: 245-6]). Discourse is a higher order of communication, with the aim of renewing or replacing 
a problematized consensus. To participate in discourse just is to provide reasons with the aim of 
convincing all interlocutors to accept a disputed validity claim. So, on Habermas’s view, discourse 
just is the language game of argumentation in which disputed validity claims are “redeemed”, and 
when a validity claim is redeemed successfully the disrupted consensus is restored, renewed, or 
replaced by a new one. 


On this picture, argumentation is a rule-governed practice, not a verbal free-for-all. Habermas 
identifies three levels of rules. First, there are the basic rules of logic such as the principle of 
contradiction, and the basic semantic rules of universalizability and consistency (1983 [1990: 86]). 
Second, there are procedural norms such as the principles of sincerity and accountability, namely 
that every participant must undertake, if only implicitly, to assert only what she genuinely believes 
and always either to justify upon request what she asserts or to provide reasons for not offering a 
justification. These are, contends Habermas, preconditions for all genuine argumentation, i.e., 
argumentation conceived as a “search for truth” and organized like “a competition for better 
arguments” (1983 [1990: 87-8]). Third, there are the processual preconditions that immunize 
discourse against coercion, repression, and inequality. These norms are supposed to insulate 
discourse from all persuasive forces except the “unforced force of the better argument”, and they 
must be followed, if a rationally motivated consensus is to be reached. 


Habermas suggests that the following rules of discourse can be established: 


(1) 
Every subject with the competence to speak and act is 
allowed to take part in a discourse. 

(2) 
a. 
Everyone is allowed to question any assertion whatever. 

b. 
Everyone is allowed to introduce any assertion whatever into 
the discourse. 

c. 
Everyone is allowed to express his attitudes, desires, and needs. 

(3) 
No speaker may be prevented, by internal or external 
coercion, from exercising his rights as laid down in (1) and 
(2) (1983 [1990: 89]; Rehg 1994: 62). 


The above list is not intended to be complete, and Habermas nowhere provides a complete list. The 
rules are a representative sample borrowed from Robert Alexy’s Theory of Practical Discourse (Alexy 
1978 [1990: 165-7]). 


3.2 Performative Self-Contradiction and Transcendental Pragmatic Justification 


Habermas assumes that the rules (1), (2) a-c, and (3) above can be identified by a test of 
performative self-contradiction. A performative contradiction arises when a rule that speakers 
pragmatically invoke by the illocutionary act of, say, assertion, is contradicted by the semantic 
content of that assertion. An example is Moore’s paradox: “It is raining but I don’t believe it” (Moore 
1993). Habermas maintains that the test of whether the denial of a rule yields a performative 
contradiction does not justify rules of discourse so much as identify them (1983 [1990: 95]). On that 
point, he differs from his colleague Karl-Otto Apel, who in his seminal 1973 paper attempts what he 
calls a transcendental-pragmatic, “ultimate” justification of the norms of a minimal ethics, by 
showing how they follow directly from the pragmatic preconditions of communication (Apel 1976 
[1980]). According to Christoph Lumer, Habermas made the same attempt in the original manuscript 
of Moral Consciousness and Communicative Action, though abandoned it in subsequent versions 
(Lumer 1997: 50). Later, as Heath (2014: 843) notes, Habermas denies that his principle of 
universalizability (U)—see §3.3 below—can be justified through a “performative contradiction 
argument”, not only because applying the test of performative contradiction to a rule is in his view 
heuristic, not justificatory, but also because, he thinks, no such principles and certainly no 
substantive moral norms follow directly from those rules. 


3.3 The Principles of Discourse Ethics and their Justification 


The two principles of discourse ethics are principle (D) and principle (U), the principle of 
universalizability. In phase one, Habermas formulates (D) as follows: 


(D) 
Only those norms can claim to be valid that meet (or could 
meet) with the approval of all in their capacity as participants 
in a practical discourse (1983 [1990: 66]). 


He initially calls (D) “the principle of discourse ethics” (1983 [1990: 66]). It takes the form of a 
straightforward validity to consensus (or possible consensus) conditional. The scope of the consensus 
is “all those affected” by the norms. As such it is universal among participants in practical discourse, 
namely among all agents. 


In addition, there is principle (U)—the principle of universalization, which spells out the “criterion for 
generalizing maxims of action” (1983 [1990: 62]). Every valid norm has to fulfil the following 
condition: 


(U) 
All affected can accept the consequences and the side effects 
its general observance can be anticipated to have for the 
satisfaction of everyone’s interests (and these consequences 
are preferred to those of known alternative possibilities) (1983 
[1990: 65]). 


Habermas first introduces (U) as “a rule of argument that makes agreement in practical discourses 
possible” (1983 [1990: 66]). In phase one he begins with a justification of (U), and then infers (D) 
(1983 [1990: 93]). The initial idea was that (U) was to be given a “transcendental pragmatic” 
derivation from the rules of discourse, and (D) inferred from (U). The nature of the putative transition 
between (U) and (D) however remains unclear (Lumer 1997: 49), and Habermas has not succeeded in 
clarifying it. 


In the 1983 programme, after having distanced himself from Apel’s programme, Habermas proposes 
to justify (U) by deriving it logically from two premises: (1) the rules of discourse and (2) “the idea of 
the justification of norms” or “a weak idea of normative justification” (1983 [1990: 92, 97]). Later, in 
phase two, Habermas reverses the order of justification, and infers (U), now designated the “moral 
principle”, from (D). At the same time, he weakens (D) by making it apply generally to all norms of 
action whether moral or not, and labels it “the discourse principle”: 


(D) 
Only those action norms are valid to which all possibly 
affected persons could agree as participants in rational 
discourse (1992b [1996b: 107]). 


(D) now merely “expresses the meaning of post-conventional requirements of justification” (1992b 
[1996b: 107]), and is given along with premise (2). It supposedly merely explicates “the point of view 
from which action norms can be impartially grounded” (1992b [1996b: 109]). While Habermas 
weakens (D) he strengthens (U). 


a norm is valid if and only if the foreseeable 
consequences and side effects of its general observance 
for the interests and value-orientations of each 
individual could be freely accepted jointly by all 
concerned. (1996a essay 1: 60 [1998a: 42 translation 
amended]) 


Neither Habermas’s followers, with the possible exception of Rehg (1991 & 1994), nor any of his 
critics, think that this logical derivation of (U) goes through. Gradually, Habermas backs away from 
the claim that (U) can be given a logical or formal derivation. He presents discourse ethics as a 
“programme” of philosophical justification and keeps adding stronger premises to that programme 
(Lumer 1997: 53; Habermas 1983 [1990: 43]). According to Kettner the fact that in phase two (D) is 
no longer the principle of discourse ethics, but construed as “still neutral” with regard to morality 
and law, together with the fact that (U) is now the moral principle, means that the grounding of 
discourse ethics is normatively incomplete, and this effectively amounts to the “disappearance” of the 
original project, which was supposed to show that the moral principle and the moral point of view fall 
out of the success conditions for communicative action (Kettner 2002: 207, 211-12). 


To sidestep the absence (or failure) of a logical derivation of (U), Habermas presents (U) 
provisionally as an abductive hypothesis (1996a essay 1: 60 [1998a: 42]) where abduction is an 
inference to the best explanation. Ott makes an attempt to justify (D) and (U) as a conjunction of 
pragmatic implications (Ott 1996: 42-45), while Finlayson claims that (U) is best thought of as an 
inference to the best explanation, and both agree that it rests on supplementary premises drawn 
from Habermas’s modernization theory (Finlayson 2000; Ott 1996). This raises the question of what 
kind of justification of the moral standpoint Habermas has in mind. In Moral Consciousness and 
Communicative Action he presents discourse ethics as a defence of a cognitivist account of morality 
against the moral sceptic on grounds that even the sceptic must accept (1983 [1990: 76ff]). The 
philosophical programme of justification looks like a constructivist—and what Gunnarsson calls a 
“rationalist”—argument, invoking slender premises and reaching thick normative conclusions, albeit 
the said normative conclusions are not substantial moral norms, but a moral principle of 
universalization, and an account of the moral point of view, in which moral agents (participants in 
discourse) collectively validate substantial moral norms (Gunnarsson 2000: 99, see also Rees 2020: 
678-81, 689-91). Gunnarsson argues that all Habermas’s rationalist arguments for (U) fail. Benhabib 
is also of this view. In an early critique she argued that (U) was redundant, and that (D) should be 
given a “weak justification programme” which adduced supplementary moral premises of “universal 
respect” and “egalitarian reciprocity” (Benhabib 1986: 308; 1992: 31). Habermas’s tendency to add 
ever stronger supplementary premises for his justification of (U) is a tacit acknowledgement of his 
failure to provide either a formal derivation or a “rationalist” justification of (U). In doing so he moves 
away from his original idea of providing a philosophical justification of (U), towards a more modest 
explication and reconstruction of the moral point of view, which can be corroborated by insights 
drawn from an array of disciplines, including the moral psychology of Lawrence Kohlberg. 


3.4 (U) and Kant’s Categorical Imperative 


Principles (D) and (U) are supposed to capture the practice of universalization in ethics, but in a 
social sense that differs from apparently similar procedures in Rawls, whom Habermas takes to be 
working in the Kantian tradition of political philosophy. The differences are as follows. (U) is not 
presented as a moral norm: it is not formulated as an imperative. Initially, (U) is presented as “a 
bridging principle” on analogy with the principle of induction, except that (U) tests whether norms 
are amenable to consensus among all affected, given their interests. (U) is not a hypothetical test for 
generating moral norms. It rationally reconstructs an actual practice by which real moral agents, as 
participants in an actual discourse, themselves ascertain the validity of moral norms, or institute 
them. Finally, whereas, according to Habermas, the test of universalization contained in Kant’s first 
formulation of the Categorical Imperative is “monological”, since it can be performed by each person 
individually, the procedure captured by (U) is “dialogical” since it must involve other people acting 
together. As McCarthy puts it: 


Rather than ascribing as valid to all others any maxim 
that I can will to be a universal law, I must submit my 
maxim to all others for the purposes of discursively 
testing its claim to universality. The emphasis shifts 
from what each can will without contradiction to be a 
general law, to what all can will in agreement to be a 
universal norm. (1983 [1990: 67], Habermas quoting 
McCarthy 1978: 326) 


Moreover, on the Kantian conception a maxim is adopted in virtue of its universal form, whereas on 
Habermas’s conception it is not just the form of the maxim that is universalized. It is in part 
the content of the norm that is universalized. This is why (U) refers to the “interests” of everyone 
affected by it. But on Habermas’s view universalization is thought of as a social process that extends to 
the very self-conception of the moral agent. Habermas’s guiding idea here is G. H. Mead’s notion of 
“ideal role taking” (1988b: chapter 3 [1992a]; 1983 [1990: 56-68, 121]), in which moral agents learn 
by projecting themselves into the position of all other moral agents. This is a process whereby agents 
consider “every interest involved” and evaluate what is “good for everyone under the same 
conditions”, a process that, according to Mead, leads “to the development of a larger self, which can 
be identified with the interests of others” (Mead 1934 [1962: 363]). 


3.5 The Return of Habermas’s Kantianism and the Morality/Ethics Distinction 


Habermas thus develops a distinctively social and intersubjective version of ethics, different from 
Kant’s. But certain developments in phase two push back toward a Kantian conception of discourse 
ethics (Heath 2014: 846). In Between Facts and Norms, (U) is said to allow participants to agree on 
norms that are impartial, and that have “categorial validity” (1992b [1996b: 28]). And Habermas 
appears to hold, like Kant, that the rational will is the source of moral obligation (1992b [1996b: 
110]; Kettner 2002: 212). Furthermore, the moral domain, which in phase one encompassed the 
entire domain of action norms and values, later “shrinks” to a domain of thinner, universal moral 
norms with characteristic overridingness and stringency (1991a chapter 6, 202 [1993: 91]). 


It is primarily Habermas’s introduction of a tripartite typology of practical reason—the ethical, moral, 
and pragmatic employments of practical reason—that drives his return to Kant in Between Facts and 
Norms. In particular Habermas introduces a notion of ethical discourse that differs from moral 
discourse in that it addresses clinical questions such as “What is the good life (for me, or for us)?” 
and that is restricted in scope to communities with shared values. Ethical discourses differ, he 
claims, from moral discourses which deal with questions of justice, understood as what is equally in 
the interests of all (1983 [1990: 180]; 1991a chapter 5, 101-105 [1993: chapter 1, 3-8]; 1996a essay 
1: 39-40 [1998a: 26-7]). He understands “justice” not as a political value, as the later Rawls does, 
but as the supreme moral value, and the designated value of moral discourse, on analogy with truth 
for theoretical discourse. Although he allows that many questions can be addressed by both ethical 
and moral discourses, he draws a sharp distinction between them and accords priority to morality 
over ethics (1983 [1990: 104]; 1991a chapter 5, 100-119 [1993: chapter 1, 1-17]). Ethical discourses 
always take place within the bounds of moral permissibility, and on Habermas’s account no moral 
norm can be weighed against, or trumped by, any ethical value. Along with various communitarians, 
Charles Taylor and Martin Seel both criticize Habermas on this point and defend the idea that the 
good (albeit construed in different ways) has priority over the morally right (Taylor 1989 [1992]; Seel 
1995). 


The ethics/morality distinction has also been a target of much criticism. Benhabib (1986), McCarthy 
(1991; chapter 7: 181-200), Putnam (2002), and Kettner (2002), among others, have shown that 
Habermas cannot and does not maintain a sharp distinction between the two. The main reason they 
give is that the notion of value that Habermas associates with ethics bleeds into his conception of a 
valid moral norm. Indeed, Habermas himself acknowledges the existence of what he calls a “hidden 
link” between justice and the common good (1983 [1990: 202]), and “the remnant of the good at the 
core of the right” (“das Gute im Gerechten”) (1996a essay 1: 43 [1998a: 29]). And he appears to think 
of this relation, not as a threat to his conception of morality but as a feature of it, namely that it rests 
on an underlying web of solidary relations between human beings. Habermas claims that morality 
arose from this web of solidarity as a process of generalization and universalization in the course of 
modernization and that, although solidarity remains as the other side of justice (qua moral 
rightness), this does not smudge the bright line between ethics and morality (Habermas 1986a 
[1990b]). 


Kettner objects that Habermas’s typology is dogmatic, established by “terminological fiat” and that it 
should not be seen as an insight into the nature of practical reason (Kettner 2002: 208). Few critics 
are prepared to defend Habermas’s distinction, with the notable exception of Rainer Forst (Forst 
2007 [2011: 60-79]). In the final analysis, Habermas settles on the view of discourse ethics in 
general, and (U) in particular, as a reconstruction of an alternative to Kant’s first formula of the 
Categorical Imperative that is allegedly superior to the latter in virtue of being dialogical rather than 
monological and the reconstruction of an actual practice involving real moral agents. 


3.6 Dialogical vs. Monological Morality 


Much depends on Habermas’s articulation and defence of the distinction between a dialogical and a 
monological ethics, and his argument for the superiority of the former, which he makes with 
vehemence and conviction both in his critical engagement with Lawrence Kohlberg and later in his 
debate with Rawls. Nonetheless, critics have argued that the putative distinction between dialogical 
and monological ethics is one without a difference, and have cast doubt on Habermas’s argument for 
the cognitive superiority of dialogical ethics. McMahon argues that (U) is ambiguous between two 
very different conceptions of dialogicality. Weak dialogicality consists in the concurrence of 
independent judgements about which norm would satisfy the interest of all concerned, that is put 
together in piecemeal fashion (McMahon 2000: 521). Strong dialogicality, by contrast, involves a 
collective, joint judgement by all affected. McMahon claims that Habermas, and Rehg (1991 & 1994), 
whose interpretation of (U) Habermas cites approvingly (1992b [1996b: 109]), endorse the strong 
conception. But the strong conception, which requires that each participant in discourse suspend 
judgement until all others have cast their vote, is, so to speak, incoherent. For one thing, it robs each 
participant of any reason to judge from their own perspective whether a norm is valid. For another, it 
renders deviation from a norm impossible, because as soon as an agent deviates from a norm, it is no 
longer valid, and cannot be criticized as mistaken (McMahon 2000: 529). 


Another reason to think that the distinction is without a difference is that Habermas allows 
participants in an actual discourse to conduct advocatory discourses whereby they imaginatively 
project themselves into other people’s points of view. He must allow this because it would be 
impossible to conduct an actual discourse with “all affected” by a norm. So Habermas must allow that 
the constituency of all participants in an actual discourse might be small, at the limit two people, and 
that the constituency of “all affected” by a norm might consist of every moral agent and patient 
including the unborn. But with that degree of idealization, the difference between an ideal discourse 
conducted monologically and an actual but advocatory dialogue between real people has all but 
disappeared, and the cognitive superiority of the latter cannot be maintained. 


3.7 Discourse Ethics and Critical Social Theory 


Habermas claimed in Theory of Communicative Action that Adorno and Horkheimer’s critical theory 
failed to provide an adequate account “of its own normative foundations” (1981 [1984a: 374]) and 
that his own project was, by contrast, the “beginning of a social theory concerned to validate its own 
critical standards” (1981 [1984a: xxxix]). 


In Legitimation Crisis Habermas had explored the idea that critical social theory could find its point 
d’appui in “suppressed generalizable interests”, namely the unrealized rational potentials of modern 
society (1973a [1975: 111]). As we saw, and Habermas himself admits, he did not develop this 
approach in Theory of Communicative Action. He claimed that critical theory must refrain from 
“critically evaluating ... forms of life and cultures ... as a whole” and focused instead on investigating 
the way in which communicative pathologies arising from the colonization of the lifeworld hinder 
“learning potentials” (1981 [1984a: 383]). Critics such as Schnadelbach and Taylor countered that 
such an approach would only explicate critical or rational potentials, but never justify normative 
criticisms of society (Schnadelbach 1986 [1991]; Taylor 1986 [1991]). Recall that (U) is a moral 
principle that is supposed to validate norms containing generalizable interests. Some of Habermas’s 
supporters in the 1980s, for example, Honneth and Benhabib, argued that discourse ethics could 
serve as the account of the normative foundations that Habermas argued first generation critical 
theory was missing (Honneth 1985 [1991: 286]; Benhabib 1986: 279-81). 


The trouble is that Habermas’s social theory does not set out to criticize society in virtue of its 
degree of conformity to principle (U) or to any substantive moral principle. Habermas continues to 
deny that this is the proper approach of social theory. He explicitly rebukes Rawls for thinking that 
the theorist can criticize society by first designing “the basic norms of a well-ordered society on the 
drafting table” and then checking society against it (1990d [1994: 101]; 1996a essay 3: 122 [1998a: 
97]). That approach arrogates the task of criticism to the philosopher, instead of social agents and 
citizens. Besides which, the problem, as Schnadelbach understands it, is that social criticism proper 
requires normative judgements, which must be supported by normative reasons, while discourse 
ethics limits itself to the clarification of the moral point of view, and (U) is a procedural principle that 
leaves such judgements up to participants in discourse, and cannot furnish the requisite reasons 
(Finlayson 2013b). 


3.8 Other Criticisms 


In the 1980s and 1990s a number of feminist critics, inspired by Carol Gilligan’s critique of 
Kohlberg, In a Different Voice, developed two significant lines of argument against Kantian moral 
theory and Rawlsian liberalism. The first was that the “liberal” conception of the moral self is formal, 
abstract, and gender-neutral. The second was that the moral self is a male self masquerading as 
neutral and universal. The suggestion is that morality/moral theory is complicit with discrimination 
against women and patriarchal oppression. They aimed a similar criticism at Kantian, Rawlsian, and 
Kohlbergian conceptions of “the moral standpoint” (Benhabib & Cornell [eds] 1987a; Benhabib 1992; 
Meehan [ed.] 1995). Among these feminist critics, some, e.g., Benhabib and Meehan, were initially 
supportive of discourse ethics because of its emphasis on including other voices, and its insistence that 
its ideals must be won from the reconstruction of actual practices of real agents. Later, their arguments 
were recalibrated and turned against discourse ethics. Because the moral self is conceived as formal 
and abstract, and because the moral point of view is characterized by the formal criteria of 
universalizability and reversibility, Habermas’s discourse ethics cannot accommodate the kind of 
moral experiences that, Gilligan maintains, are characteristic of women: it is blind to the importance 
of considerations of care and responsibility for others. This leads, Benhabib claims, to a privatisation, 
personalisation, and devaluation of women’s moral experiences (Benhabib & Cornell 1987a: 7-9; 
Benhabib 1992: 152, 184). She argues further that “the restriction of the moral domain to questions 
of justice” results in “the privatization of women’s experience and leads to epistemological blindness 
towards the concrete other” (Benhabib 1992: 164). There is some doubt whether Benhabib’s position 
amounts to a refutation of Habermas’s theory or an explication of it. The universalization procedure 
imposed by discourse involves participants in moral discourse imaginatively switching perspectives 
with concrete other people. If it were not so, if “ideal role taking” demanded only that one examined 
candidate norms from the perspective of others generally conceived, there would be nothing to gain 
from such a procedure (Finlayson 2013a). It is also unclear whether such criticisms apply to 
Habermas’s moral theory, or to the actually existing Kohlbergian “Stage 6” morality he takes as its 
object. Nor is it clear whether the criticism shows merely that discourse ethics is a flawed 
moral theory, or whether it shows that, by dint of these flaws, the theory (or the morality it theorizes) 
perpetuates discrimination or oppression against women. 


Finally, discourse ethics together with the theory of modernity in Theory of Communicative Action 
has been criticized from an anti-colonial perspective. The most sustained and detailed 
criticism comes from Amy Allen. Noting that Habermas’s argument for principles (D) and (U) rests on 
assumptions drawn from modernization theory, she brings the charge of Eurocentrism. Furthermore, 
building on criticisms by Dussel (1993) and Bhambra (2011) among others, she argues that 
Habermas’s discourse ethics is wedded to a theory of social evolution and the idea of modernization 
as a learning process, which commits him to a “progressive view of history” (Allen 2016: 72-3). In 
spite of his attempt to deflate the presuppositions of the philosophy of history that weighed on 
Western Marxism, Allen and Dussel argue that Habermas’s theory is freighted with a dogmatic 
Hegelian universalism that in the final analysis asserts the developmental superiority of European 
modernity. 


4. The Discourse Theory of Law and Democracy: Between Facts and Norms 


Between Facts and Norms is a legal and political theory focused on the ways in which constitutional 
democracies produce and institutionalize democratically legitimate law. As Habermas wrote and 
researched the book in the late 80s and early 90s, political theory was in the grip of the debate 
between liberalism of a Rawlsian or Kantian stripe, and communitarianism, or, as Habermas 
preferred to call it, Neo-Aristotelianism. In Between Facts and Norms he attempts to mediate between 
these two opposed approaches to political theory. He does this by setting out and defending the 
thesis of the equiprimordiality (or co-originality) of liberal rights and popular sovereignty, and their 
correlative values of individual and collective autonomy. 


One of the basic ideas in Between Facts and Norms is that the rule of law, and indeed—though this is 
implied not stated—legitimate liberal constitutional states, cannot exist without radical democracy 
(1992b [1996b: xlii]). But at the same time radical democracy has to be made compatible with the 
exigencies of large-scale administrative and bureaucratic states organized through law. Habermas 
also tries to locate and defend the common ground between legal positivists, like 
Austin, and normativists like Dworkin and Rawls. He does this by identifying and rationally 
reconstructing the “normative self-understanding” of the legal system as embodied in 


particles and fragments of an “existing reason” already 
incorporated in existing practices. (1992b [1996b: 287]) 


Habermas does not oppose the ideal, but attempts to reconstruct idealizations embodied in existing 
practices, and set these in play without relying on the optimistic providential assumptions of the 
philosophy of history. The practices he has in mind are legislative practices, and the “particles of 
existing reason” consist in the various ways in which discourse is built into those practices. In short 
Habermas reconstructs and describes the ways in which discourse is institutionalized by political and 
legal systems. Thus his approach is a hybrid between the sociology of law and jurisprudence, and 
normative political philosophy. The approach is captured by the title Faktizität und Geltung (literally 
“Facticity and Validity”), or Between Facts and Norms where the “between” designates a complex set 
of interrelations, rather than a middle ground. 


4.1 The Two-Tracked Theory of Democracy 


Habermas’s conception of democracy has been called a “two track” theory (Baynes 1995). That said, 
it is more like a “two complex” conception, since the “tracks” in question designate the formal and 
informal public spheres and each of these is a complex. At the centre is the parliamentary complex, 
not only parliament but the administrative and judicial bodies accompanying it. Parliament itself is a 
formal “public” forum that is legally established and organized to take decisions (1992b [1996b: 
355]). 


It is surrounded by and embedded in an “informal” public sphere, an “open and inclusive network” of 
various kinds of discourse—moral, ethical, and pragmatic—that form “a ‘wild’ complex” that is not 
formally organized, even though each form of discourse has its own internal discipline (1992b 
[1996b: 307]). (Habermas also calls the informal public sphere “civil society” to indicate that it is not 
legally or politically regulated.) When deliberative democracy works as it should, discourse and its 
outputs—moral norms, values, and more broadly public opinion—percolate into the parliamentary 
complex through a system of sluices and channels (1992b [1996b: 355]). These inputs are then 
worked up in parliamentary discussion and debate and eventually embodied in the form of laws and 
policies and returned as outputs into society at large. As they have been shaped by public opinion 
and shared moral values, and worked up through debate, they find acceptance by citizens on the 
basis of the reasons they embody. That is how, when things go well, according to Habermas’s theory, 
legitimate law is produced. Roughly, this is Habermas’s account of the production of the “validity” of 
modern law. 


This model can also be thought of in terms of the circulation of communicative power from periphery 
to centre, whereby the unregulated flows of communication and discourse in civil society lay siege to 
the political system, “without, however, intending to conquer it” (1992b [1996b: 487]) but in a 
manner that allows them to influence judgement and decisions in the political system. The model is 
supposed to explain how the embers of radical democracy can stay aflame within a modern 
bureaucratic state. Earlier models of popular sovereignty presuppose that society is “an association 
writ large” or a macro-subject with a sovereign will: citizens are actually authors of the very laws to 
which they must submit, and hence encounter these as an expression of, not a constraint on, their 
autonomy. But such a model would only work, if at all, in small-scale, ethically homogeneous 
societies, with a very high degree of popular participation, none of which applies to modern Western 
states (1992b [1996b: 102-3]). By contrast, Habermas’s discourse theory offers a picture where 
members of civil society can, through participation in discourse, help shape public opinion, which 
through the circulation of communicative power from periphery to centre can indirectly “program” or 
“counter-steer” the political system (1992b [1996b: 372; 332]) by means of laws and policies that are 
“in the equal interest of all” (1992b [1996b: 98; 154]). In this way, he claims, the political autonomy 
of legal persons is ensured since they 


can at the same time understand themselves as authors 
of the law to which they are subject as addressees. 
(1992b [1996b: 408; 187]) 


William E. Scheuerman’s objection is pertinent here. On Habermas’s view, a law acts as a 
“transformer” between the communicative power circulating in civil society and the administrative 
power of the legal and political systems, and it has to do so if it is to facilitate social integration. 
However, this view is incompatible with Habermas’s earlier conception of system and lifeworld, 
according to which the former, which includes the legal and administrative system, remains a “block 
of more or less norm free sociality” (1981 [1987: 171]). And Scheuerman points out that, not only is it 
improbable that communicative power can be transformed into administrative power and counter- 
steer the political system, but Habermas gives no detail about where this interface lies— 
institutionally speaking—and how it operates. 


A more radical objection stems from the direction of the Rochester School, whose members consider that the very 
idea of a “common good” or “the equal interests of all” that might be served by law, is an illusion 
(Riker 1982). Other realist theorists of democracy, e.g., Luhmann, deny that public reasons flowing 
into the political system from civil society can and do steer political decisions (Luhmann 1969). 


4.2 The Co-Originality Thesis 


Habermas’s theory asserts what he calls a co-originality thesis, between various pairs of political 
ideas: between the system of rights and the principle of democracy; between private/individual and 
public/political/civic autonomy; between individual rights that secure the former, and popular 
sovereignty that is the expression of the latter. Co-originality means that both enjoy equal priority, 
and that neither is reducible to the other: they reciprocally call one another into being. The 
co-originality relations are revealed when the idea of self-legislation, namely “that the addressees of law 
are simultaneously authors of their rights” is decoded in discourse-theoretical terms (1992b [1996b: 
104, 314, 409]). 


The co-originality thesis has both architectonic significance for Habermas’s theory and substantive 
implications. The architectonic implication is that both the principle of democracy, and the system of 
rights are derived independently of principle (U). One substantive implication is, as Ingeborg Maus 
argues (Maus 1996 [2002: 90-98]), that 


the circular process in which the ... legal form, and ... 
the democratic principle—are co-originally 
constituted (1992b [1996b: 122]) 


indicates that basic rights are called into being by the democratic process and vice versa, in such a 
way that neither depends on, or is externally constrained by, an antecedently existing order of moral 
rights (pace Larmore 1995) or an ethical form of life (pace Bernstein 1996 [1998] & Michelman 
1998). The co-originality thesis thus expresses Habermas’s view of the autonomy of the political (and 
legal) domain from the moral, and the sui generis nature of political legitimacy. 


4.3 The Principle of Democracy 


The keystone of Habermas’s political theory is the principle of democracy that states: 


Only those statutes may claim legitimacy that can meet 
with the assent of all citizens in a discursive process of 
legislation that in turn has been legally constituted. 
(1992b [1996b: 110]) 


It also takes the form of a validity-to-consensus conditional, though democratic discourse is inclusive 
of ethical, moral, and pragmatic reasons, and indeed fair compromises, so that consensus is a messy 
and imperfect affair. This makes the process of reaching discursive agreement far more difficult, 
when one considers that many deeply held ethical values are limited to particular cultural 
communities, though this difficulty is supposed to be mitigated by the mediation of the legislative 
process. 


A central contention of Between Facts and Norms is that what Habermas calls the principle of 
democracy “derives” from the interpenetration of principle (D) and the legal form (1992b [1996b: 
122-3]). Recall that in phase two principle (D) is supposed to be a rule of practical argumentation in 
general that is neutral with respect to morality and law (1992b [1996b: 107]). In this respect, 
Habermas claims, as in respect of the circular process of reciprocal co-original constitution, the 
principle of democracy is “morally freestanding” (1992b [1996b: 80]). That is, the principle of 
democracy is derived completely independently of the moral principle (Finlayson 2019: 94). Here 
again Habermas insists on the autonomy of the democratic political process, and claims that the 
democratic procedure of the production of law is the sole source of its legitimacy, and he criticizes 
the views of Rawls, Dworkin, Larmore, and Apel, all of whom, in various different ways, claim that the 
legitimacy (or validity) of law is borrowed from that of morality. 


One difficulty facing Habermas is to square his claims that (a) democratic discourse is an amalgam of 
all three kinds of discourse, and (b) that political legitimacy is sui generis and both ethically and 
“morally freestanding”, with (c) the absolute priority of morality in all spheres. He claims that 
legitimate laws “must harmonize with the universal principles of justice” (1992b [1996b: 99, 155]) 
and that legitimate laws must not “contradict basic moral principles” (1992b [1996b: 106]), by which 
he means valid moral norms. He places a moral permissibility constraint on political legitimacy, in 
such a way that it appears that morality constrains political legitimacy from the outside (Finlayson 
2016). Habermas does not, though, see this as an “external” constraint, since he argues that morality 
flows into the political and legal domain through the constitutional role of basic rights, and circulates 
within it. Nevertheless, the moral permissibility constraint smudges the bright line that Habermas 
likes to draw between what he calls natural law theories of legitimacy, which are based on an 
antecedent morality, and discourse theory, which is not. 


4.4 The System of Rights 


Habermas argues that, alongside the principle of democracy, what he calls a “logical genesis of 
rights” arises from the “interpenetration” of the legal form and the discourse principle (D) (1992b 
[1996b: 121]). The argument is hard to follow. It begins from the premises of (D) and the form of 
modern law, and assumes that the idea of legitimate law presupposes that of a legal subject qua 
bearer of rights, no matter what the content of those specific rights is. The conclusion to the 
argument is a system of rights, of five different kinds. 


1. Basic rights to the greatest possible measure of equal individual liberties. 
2. Basic rights to membership in a voluntary association of consociates under law. 
3. Basic rights to the actionability of rights arising from the legal protection of rights-holders. 
4. Basic rights to the equal opportunity to participate in the processes of political will 
formation and the production of legitimate law. 
5. Basic rights to living conditions that are socially, technologically, and ecologically 
safeguarded, insofar as this is necessary for citizens to exercise their civil rights 1-4 (1996b: 
123-4). 


The first three rights are supposed to arise theoretically from the application of the discourse 
principle to the form of law. These are rights that citizens must grant to one another if they are 
“legitimately to regulate their living together by means of positive law” (1992b [1996b: 126; 82; 
118]). The next two—political and social rights—are practical and material enabling conditions that 
ensure the effectiveness of the first three rights. The first three rights, Habermas claims, are not 
specific rights, but what he calls “unsaturated placeholders” for specific rights that have to “be 
interpreted and given concrete shape” by actual citizens in response to determinate historical 
conditions (1992b [1996b: 125-6]). This is crucial to Habermas’s theory, because it purports to 
reconstruct the ability of citizens, from their perspective, to reciprocally grant one another the rights 
necessary for their common existence as consociates under law. That’s why he claims that he, unlike 
Rawls, doesn’t design “the basic norms of a well-ordered society on the drafting table”, and then 
apply them to society (1990d [1994: 101]). In that sense, just as discourse ethics leaves the validation 
of moral norms to participants in discourse, the discourse theory of law and democracy has to leave 
the political process of establishing a system of rights up to citizens themselves as much as possible. 
This is the sense in which Habermas claims the discourse theory of democratic legitimacy is “strictly 
procedural” and more modest than “normative political theory” à la Rawls (Habermas 1995: 117 & 
132; Rawls 1995: 175-177). For all that, unlike in discourse ethics where neither (U) nor (D) has the 
status of valid moral norms, Habermas nonetheless derives a system of rights that for all the world 
resembles T. H. Marshall’s Whiggish account of civic, political, and social rights, in his classic work 
of political sociology (Marshall 1950). 


4.5 Objections to Between Facts and Norms 


Joshua Cohen objects that the principle of discourse does not amount to a requirement of equal 
liberty, and that nothing so rich as Habermas’s scheme of individual liberties follows solely from the 
application of the discourse principle to the legal form (Cohen 1999: 393, 398). He objects even while 
acknowledging that the various rights are not yet saturated: they are not yet specific, historically and 
socially determinate rights. But contra Cohen, on Habermas’s account, legal form, or modern “form 
of law” is a richer idea than the mere rule of law, and refers to a complex of features that law has in a 
modern constitutional democratic state. As Baynes and Zurn point out, Habermas’s theory 
reconstructs the way that, via the discourse principle, the form of law in modern—that is, post- 
traditional and post-conventional—societies functions to compensate for the loss of shared traditions, 
and relieves the burden on citizens to reach reasoned agreement with one another and thereby 
coordinate their actions (Baynes 2016: 166; Zurn 2011). 


Some critics argue that Habermas is wrong to look for a justification of basic rights that is 
functionalist, or merely “internal to law”, one that sees them only as necessary conditions for the 
institutionalization of the democratic process, or one that is strongly constructivist, that begins from 
slender premises that eschew moral or ethical considerations (Forst 2011; Larmore 1995; Michelman 
1998; Bernstein 1996; and Cohen 1999; cf. Flynn 2003). The upshot of such criticisms is that 
Habermas’s justification of the system of rights requires stronger normative support of one kind or 
another, and that political legitimacy is not entirely sui generis. 


Rawls, Cohen, and Larmore argue in addition that Habermas’s political theory rests on what Rawls 
calls a “comprehensive doctrine” since it is based on a controversial theory of meaning and 
communication and a controversial doctrine of method (Rawls 1995: 139; Cohen 1999; Larmore 
1995). However, there is an important difference between comprehensive philosophical doctrines 
and comprehensive moral, ethical or religious doctrines. The fact that a normative political theory 
has controversial philosophical assumptions, which almost all do, does not create the kind of 
practical problems that arise when a political system, or constitution, is saddled with controversial 
moral or religious assumptions, and its citizens cannot regard it as legitimate (Lister 2007). To make 
that claim is to presuppose that political theory answers to the same canons of justification as 
political systems (Laden 2010). 


5. Methodology and Philosophical Framework 


In the transitional period of the 1970s when Habermas began his communicative turn, he developed 
various ideas about method that came to shape his mature work: for example, rational reconstruction 
(§5.1) as a method for critical social theory, postmetaphysical thinking (§5.2) as a framework for 
philosophy, and a set of related views about the proper role of philosophy. 


5.1 Rational Reconstruction 


Rational reconstruction is the method, and the label, for a cluster of methodological assumptions 
shaping the major philosophical projects of Habermas’s middle period: the theory of communicative 
action, discourse ethics, and the discourse theory of law and democracy. He originally developed it as 
part of an attempt to explain social phenomena and to recalibrate critical social theory on the basis of 
formal pragmatics. 


Rational reconstruction is an approach that Habermas developed on the model of Noam Chomsky’s 
universal grammar (1976b [1998b 1: 35]), Jean Piaget’s developmental psychology, and Lawrence 
Kohlberg’s moral psychology (1983 [1990: 33-41]). These are theories that reconstruct universal 
human capacities—for language acquisition, cognitive development, and moral reasoning, 
respectively. Habermas’s use of rational reconstruction aims to set out the structures, rules, and 
competences underlying lifeworld practices. The targets of the method may also be described as the 
idealizing, counterfactual commitments which participants in a practice must make, in order for the 
practice to be meaningful or rational for them (1999a [2003a: 85-6]; 2005b chapter 3 [2008a: 81-4]). 
To rationally reconstruct a practice is to turn the implicit “know how” of participants into explicit 
“know that” (1976b [1998b: 33, 34-5]). For example, rationally reconstructing the everyday practice 
of communication gives access not to the semantic content of the speaker’s particular utterances, 
which is already explicitly known, but the implicitly known rules which the speaker follows in 
successfully communicating (1976b [1998b: 33]). Habermas calls this “illocutionary” or “pragmatic” 
meaning. 


These underlying structures are 


brought to consciousness through the choice of suitable 
examples and counterexamples, through contrast and 
similarity relations, through translation, paraphrase 
and so on—that is, through a well thought out, maieutic 
method of interrogation. (1976b [1998b: 40]) 


They are revealed not as timeless constants, but as they have developed over time, with their internal 
developmental logics (Pedersen 2008: 463, 474-9). Habermas originally claimed that rational 
reconstruction uncovers knowledge of universal human capabilities, “species competences”, rather 
than the competences of particular groups or individuals (1976b [1998b: 34-5]; McCarthy 1991: 130- 
2). For example, rationally reconstructing the practice of everyday speech uncovers the rules of 
communicative action as such, not the grammar of a particular language. However, Habermas’s later 
description of the discourse theory of law and democracy as a rational reconstruction of “the self- 
understanding of modern legal orders” of democratic constitutional states (1992b [1996b: 82], 
emphasis removed), evidently a local phenomenon, suggests that he has since modified the scope of 
the reconstructive method. Commentators are divided on this point, with some distinguishing 
between an “empirical” variety of reconstruction on display in the Theory of Communicative Action, 
and a “normative” variety in Between Facts and Norms (Peters 1994: 119), and others arguing that 
the same methodology underlies both projects (Patberg 2014: 511-3; however, see Gaus 2013: 561). 
This tension is partly resolved if we remember that democratic law-making draws on general 
communicative competencies, and makes use of pragmatic, ethical, and moral discourses. The 
phenomenon is local, but the capacities involved are general. 


In terms of his own work, formal pragmatics rationally reconstructs the communicative capacities 
possessed by all human beings, making them explicit in Habermas’s accounts of communicative 
action and the rules for redeeming validity claims (1976b [1998b: 22-4]). The discourse theory of 
morality does this for our capacity for engaging in moral discourse, formalizing this in the (D) and (U) 
principles (1983 [1990a: chapter 2: 37, chapter 4: 174-5]), while the discourse theory of law and 
democracy does the same for the practice of lawmaking in democratic constitutional states, 
formalizing this in the system of rights and the principle of democracy (1992b [1996b: 110-1, 118- 
24, 287]). Importantly, Habermas claims that the theories produced by these processes of rational 
reconstruction have the status of falsifiable hypotheses—they are not a priori since they are not 
necessary claims, although they are supposed to be “universal” in the sense that they are, at present, 
without alternatives. Habermas sometimes refers to them having a “weakly transcendental” status. 


Whether or not formal pragmatics accurately describes the practice of human communication can 
only be decided a posteriori, by the future “success” or “failure” of the theory as an input in further 
empirical investigations (1976b [1998b: 39]; 1983 [1990: 32]), with Habermas suggesting coherence 
between theories as the criterion of success (1981 [1987: 399-400]; 1983 [1990a: chapter 2, 39]). 
Jorgen Pedersen has argued that it is still not clear what constitutes success and failure in this 
context, and thus not fully clear how rationally reconstructed theories can be tested (Pedersen 2008: 
478-1). Karl-Otto Apel, similarly, questions what it would mean to falsify the “unavoidable 
presuppositions of argumentation” itself (Apel 2002: 19). 


Habermas claims that knowledge of the underlying structures and competences acquired through 
rational reconstruction can then be used for the purposes of social critique. Supposedly, a version of 
a practice can be identified as pathological, or not fully rationalized, if it does not meet the 
counterfactual idealization which the practice presupposes (1983 [1990: 31-2]). Habermas thus 
identifies systematically distorted communication, invalid moral norms, and illegitimate laws as 
deficits in communicative action, moral discourse, and democratic lawmaking, respectively. Rational 
reconstruction allows us to critique these actual practices according to their own internal standards, 
rather than the critic’s arbitrarily chosen standards or the philosopher’s supposedly transcendental 
standards (1992b [1996b: 5]). Similarly, the developmental logic revealed by rational reconstruction 
can be used to evaluate processes of historical development as progressive examples of collective 
learning, or as regressive and pathological. As a method for explaining social phenomena, rational 
reconstruction is neither merely empirical nor hermeneutic; the principles it produces are supposed 
to ground a kind of social and political theory which is neither “ideal” nor “real”, but somewhere in 
between. Habermas’s claim that philosophy can find within traditions construed as learning 
processes a “standpoint of critical evaluation” (1996a essay 3 [1998a: 97]; 1992b [1996b: 5]) is 
certainly in the spirit of critical theory, but arguably in tension with his other claims that 
philosophy should “limit itself to the clarification of the moral point of view and the procedure of 
democratic legitimation” (Habermas 1995: 131; Finlayson 2019: 205). 


5.2 Postmetaphysical Thinking 


Postmetaphysical thinking, Habermas’s paradigm for modern philosophy, must be understood 
through contrast with its predecessor, metaphysics. He labels much of the history of Western 
philosophy as “metaphysics”, counting Parmenides, Plato, Plotinus, Augustine, Aquinas, Spinoza, 
Leibniz, and Hegel as metaphysical thinkers (1988b essay 2 [1992a: 12-13]; 1988b essay 3 [1992a: 
29]). He sometimes distinguishes between metaphysics proper and the “philosophy of consciousness” 
or “philosophy of the subject” associated with the rationalism of Descartes and the idealism of Kant, 
Fichte, and Schelling (1988b essay 3 [1992a: 31]; 1988b essay 8 [1992a: 158-62]), though it is not 
clear whether this should be considered a separate paradigm (1988b essay 2 [1992a: 12-13]) or 
simply a late stage of metaphysics. 


Among the characteristics of metaphysics are: 


• A conception of philosophy as the queen of the sciences, with its own unique method and 
form of knowledge, distinct from the natural and social sciences, which can yield special 
insights into the nature of reality and the meaning of life. Plato’s theory of forms is an 
excellent example of this, since Plato thinks that philosophy, with its dialectical method, can 
offer true knowledge (episteme) of the forms, superior to mere opinion (doxa) about the 
material world. 

• A substantive conception of reason as an Archimedean point from which the philosopher can 
observe reality as a whole. The metaphysical philosopher’s goal is to attain an observer’s 
perspective on reality, from which they can learn universal and necessary truths. Philosophy 
can thus act as the judge and arbiter of both science and culture (1983 [1990: 2-3]). 

• Idealism and identity thinking (1988b essay 3 [1992a: 29-31]). Metaphysics assumes that 
ideas are primary and the material world secondary, and that to grasp the underlying 
intellectual reality is to grasp the whole: “the structures of being themselves are what is laid 
hold of in knowledge” (1988b essay 2 [1992a: 13]). Again, Plato’s theory of forms is a prime 
example. 

• For philosophy of consciousness, the use of strong transcendental arguments to ground 
claims about the nature of reality on the individual subject’s self-knowledge. Habermas 
thinks that this turn to introspective self-knowledge as foundational took place as a result of 
the pressure which natural science was putting on metaphysics by the eighteenth century. It 
may no longer be plausible that the individual subject can grasp “the structures of being 
themselves” in thought, but they can at least grasp the structure of their own thoughts, and 
build on these foundations some certainty about the world. Kant’s transcendental unity of 
apperception (1988b essay 7 [1992a: 124-5]) and refutation of idealism in the First Critique 
are the clearest examples. 


Habermas concedes that this is a stipulative definition of metaphysical philosophy, focused on 
idealism. Ancient materialism and scepticism, medieval nominalism, and modern empiricism do not 
fit the mould, but Habermas argues that they should be seen as “antimetaphysical 
countermovements” within the horizon of metaphysics (1988b essay 3 [1992a: 29]). He often 
characterises modern philosophical trends of which he is critical as covertly metaphysical. Habermas 
sees some trends as attempted breaks with metaphysics which remain trapped within the paradigm 
(Nietzsche, Heidegger, Derrida—1981 [1987: 83-105, 131-160, 161-184]), others as still being mired 
in the philosophy of consciousness (Niklas Luhmann’s systems theory—1981 [1987: 368-385]), and 
others again as deliberate attempts to return to the philosophy of consciousness (Dieter Henrich— 
1988b essay 2 [1992a: 10-27]). 


Postmetaphysical thinking begins with the first generation of post-Hegelian thinkers (Feuerbach, 
Marx, Kierkegaard) (1988b essay 3 [1992a: 39]), and includes pragmatists such as C.S. Peirce and 
G.H. Mead, speech-act theorists such as J.L. Austin and John Searle, the later Wittgenstein, and Karl- 
Otto Apel. 


Among the characteristics of postmetaphysical thinking are: 


• Rational reconstruction as a method (1988b essay 3 [1998b: 38]). Since rational 
reconstruction is also used by many of the social sciences, philosophy is no longer seen as 
unique in its method and form of knowledge, but simply as one discipline among others. 
Postmetaphysical philosophy has no priority over the natural and social sciences, but neither 
is it subordinate to them. It opposes scientism, the idea that the natural sciences have 
unique authority and privileged access to the truth. 


• A “weak but not defeatistic (sic) concept of linguistically embodied reason” (1988b essay 7 
[1992a: 142]). Postmetaphysical philosophy conceives of reason as being historically, 
socially, and linguistically situated, embedded in the intersubjective communicative 
processes of the lifeworld, rather than transcending them. The philosopher is a participant 
in these processes, not an outside observer. Postmetaphysical reason is immanent in the 
communicative practices of the lifeworld, and procedural (1988b essay 3 [1992a: 34-5]), 
rather than being the external Archimedean point of metaphysics. Habermas thus sees 
postmetaphysical thinking as having a detranscendentalized conception of reason. 


• The use of weak transcendental arguments, rather than the strong transcendental 
arguments of metaphysics (Yates 2011: 41-4). Benhabib defines strong transcendental 
arguments as ones which aim to 


(prove) the necessity and singularity of certain 
conditions without which some aspect of our 
world, conduct, and consciousness could not be 
what it is. 


Descartes’ cogito and Kant’s refutation of idealism are examples, aiming to prove the 
existence of the subject and of the external world. Weak transcendental arguments, in 
contrast, focus on lifeworld practices such as communicative action, moral discourse, and 
democratic deliberation (rather than experience as such), and 


demonstrate more modestly that certain 
conditions need to be fulfilled for us to judge 
those practices to be of a certain sort rather than 
of a different kind. (Benhabib 2002: 38) 


Unlike strong transcendental arguments, weak transcendental arguments are both a 
posteriori, since they are based on rationally reconstructed experience, and falsifiable, since 
they can be refuted by further empirical experience (1976b [1998b: 42]). 


• The ability to provide context-transcending validity, despite abandoning metaphysical 
philosophy’s search for universal necessary truths. Validity claims are the clearest example 
(1988b essay 7 [1992a: 142]; 1983 [1990: 19, 203]). Although they are always advanced 
from within a particular lifeworld context, they hold that a certain thing is morally right for 
the intersubjective world, or true for the objective world, as a whole. The claims raised are 
not just valid within the context of the interlocutors raising them, but for all speaking and 
acting subjects, as they must be if postmetaphysical philosophy is to critique existing social 
and political conditions (1988b essay 3 [1992a: 50]). Habermas speaks in this context of the 
“immanent transcendence” or “transcendence from within” of language (2005c). 


• Modesty with regard to ethical and ontological claims (Rees 2018: 55-6, 60-2). Although 
postmetaphysical philosophy can clarify the procedures of moral discourse, it should refrain 
from making substantive contributions to ethical discourse (1988b essay 3 [1992a: 50]; 1988d 
[1990a chapter 5, 211]). Unlike metaphysical philosophy, it refrains from making ontological 
claims about matters such as the existence of God. In accordance with the postsecular orientation of 
Habermas’s recent work, postmetaphysical philosophy remains agnostic about such matters, 
while being open to the truth-contents of religious language. 


The metaphysical philosopher is an isolated figure, aiming through the use of their individual reason 
to attain a neutral observer’s perspective on reality as a whole and produce knowledge of universal, 
necessary truths. The postmetaphysical philosopher, in contrast, aims at a participant’s perspective, 
working in dialogue with the natural and social sciences and making use of a situated, procedural 
conception of reason. There is no transcendent Archimedean point for the philosopher to occupy, 
since we are all embedded in our lifeworlds and our thinking conditioned by social, historical, and 
linguistic factors. What remains distinctive about philosophy as a discipline is that it can step back 
from the particular questions which the natural and social sciences focus on and produce general 
knowledge about the human condition; but this knowledge is now based on empirical data and 
rational reconstructions, and as such is contestable and revisable. Postmetaphysical philosophy no 
longer claims the role of judge and usher of science and culture, assigning each discipline and 
cultural practice to its proper place. Instead it acts as a placeholder for fruitful collaborations 
between empirical research and philosophical ideas, and as an interpreter mediating between the 
rationalized value-spheres of science, law, and art and the everyday discourse of the lifeworld (1983 
[1990: 15-9]). 


6. Constitutional Patriotism, Cosmopolitanism, and International Law 


Habermas’s views on national identity, the nation state, and global politics have been shaped by the 
historical experience of living through the Third Reich, the division of Germany into East and West, 
reunification, and the development of the European Union. The key to his applied political theory 


consists in differentiating between the three elements 
of statehood, democratic constitution, and civic 
solidarity, which are closely linked in the historical 
form of the constitutional state. (2009 chapter 7: 112) 


Habermas argues that the conjunction of these elements is contingent, not necessary, and that in the 
“postnational constellation” of the late twentieth and early twenty-first centuries they should be 
disaggregated. His theory of constitutional patriotism argues that civic solidarity need not be 
reduced to national identity, but can rather be generated by the process of constitution-making itself 
(§6.1). With regard to the international arena, he argues that the constitutionalization of international 
law can proceed without a global state, and, as such, cosmopolitans should aim at a “politically- 
constituted world society” (2004 chapter 8 [2006c: 161]) rather than a world republic (§6.2). 


6.1 Constitutional Patriotism 


The term “constitutional patriotism” (Verfassungspatriotismus) was not coined by Habermas, but by 
the political scientist Dolf Sternberger, who popularized it in a 1979 article marking the thirtieth 
anniversary of the Federal Republic of Germany (Muller 2006). Habermas took up the term and 
developed his own distinct interpretation of it beginning with the “historians’ dispute” of 1986. 
Constitutional patriotism is the theory that in modern states the constitution can, and should, take 
the place of the nation as focus of citizens’ feelings of collective identity and the source of their civic 
solidarity. While originally concerned with specifically post-war West German questions about history 
and national identity (1988a), and thus drawing on a rational reconstruction of the postnational form 
of identification developed in the divided Germany, Habermas later applied the theory to modern 
states more generally, and to the European Union in particular (1998c). 


National identity, for Habermas, is a product of the modern era, dating from the time of the French 
Revolution and the later Romantic movement. It was constructed by linguists, historians, and writers 
and propagated through the education system and the public sphere. As such, the nation is 
intermediate between traditional and post-traditional forms of identity (1988a: 5-7; 1992b [1996b: 
494-5]). In a traditional society, collective identity is accepted unreflexively, as are the society’s 
conventional morality and worldviews. It is a supposedly natural, pre-political given. Collective 
identity in post-traditional society, in contrast, is adopted in a reflexive manner, in light of reasons 
given in the public sphere. While in reality post-traditional, the nation is reified as traditional and 
quasi-natural, and projected into a distant past. From the beginning, the nation played a political role 
in generating civic solidarity among strangers. National identity establishes an abstract level of 
solidarity, transcending the face-to-face associations which bind people in traditional society: 
villages, clans, and localities. The feeling that they belong to the same nation motivates individuals to 
make sacrifices for others whom they have never met, whether in terms of redistributive taxation or 
military service (1998c [2001a: 64-5]). This was crucial for the viability of democratic republics at 
the transition from traditional to post-traditional society. Nationhood provides the cultural substrate 
for the “nation of citizens” who govern themselves democratically (1996a essay 4 [1998a: 117-8]). 
But, for a number of reasons, the nation is less and less able to play this role today. For one, all 
contemporary nation states are increasingly multi-ethnic and multicultural, undermining the idea 
that a purportedly homogeneous collective identity can act as the substrate for democracy. For 
another, the dangers of extreme ethnonationalism are all too obvious: they generate in-group/out- 
group distinctions that can lead to discrimination, racism, and at the limit to ethnic cleansing and 
genocide. Finally, nation states themselves are increasingly powerless in the face of factors outside 
their control, such as global capitalism and climate change. This “postnational constellation” requires 
a different form of social integration (1998c [2001a: 58-112]). 


Luckily, according to Habermas, the relationship between constitutional democracy and the nation is 
historically contingent (1992b [1996b: 495]; 1996a essay 5 [1998a: 132-3]; 1998c [2001a: 76]). A 
modern society does not need anything as substantial as a shared religion, way of life, or repository 
of values to serve as the basis of collective identity. A democratic political system can generate its 
own civic solidarity. Supposing that individuals have progressed from a traditional to a post- 
traditional level of identity and are willing to adopt a more reflexive understanding of their identity, 
the constitution can replace the nation as a source of civic solidarity and attachment. The democratic 
procedure of constitution-making not only produces legitimacy, as described in the discourse theory 
of law and democracy, but can also serve as the basis of belonging. The goal of constitutional 
patriotism is not, then, to eliminate national identity, but rather to decentre it and deprive it of its 
political function. In concrete terms, constitutional patriotism involves citizens developing critical 
and reflexive loyalties and attachments to their country’s constitution and the moral principles 
encoded therein. What results from this is a collective identity with a political function. Citizens’ 
status as joint makers and interpreters of the same constitution takes the place of their shared 
natality and status as co-nationals. This view is constitutional in that it revolves around the work of 
making, interpreting, and reflecting on the constitution, which takes place in the public sphere. It 
is patriotic in that it has a binding effect on the community of citizens, furnishing them with civic 
solidarity and a collective identity. The formation of constitutional-patriotic identity, significantly, 
takes place at the level of opinion- and will-formation in the public sphere. Unlike supposedly natural 
and pre-political national identity, it is formed in the light of rational discourse (2004 chapter 6 
[2006c: 76-9]). 


Critics of constitutional patriotism argue that it is either too “thin” to play the role Habermas assigns 
to it, or too “thick” to really be postnational and open to all, as he claims (Hayward 2007: 186-9). The 
first criticism is that the universal moral principles encoded in constitutions are too abstract and 
affectively thin to generate a sense of personal loyalty and collective belonging among citizens, 
compared to the thick bonds of national identity (1996a essay 5 [1998a: 132]; Canovan 2000). 
Habermas rejects this criticism. Citizens interpret the principles found in their constitution—which 
represent universal moral and political norms of democracy and human rights, and might be found in 
any liberal-democratic constitution—in the light of their community’s unique historical experience. 
They internalize these principles, not abstractly, but in the context of the history of their country 
(1996a essay 8 [1998a: 225-6]). Constitutional principles become part of the “dense web” of a 
society’s (and an individual’s) historical experiences and pre-political values (2004 chapter 6 [2006c: 
77-8]). It follows that each country’s constitutional patriotism will be different, inflected by the 
particular past that country has worked through—Habermas notes that French constitutional 
patriotism, marked by a tradition of revolutionary democracy, will be different to German 
constitutional patriotism, marked by the historical failure to produce a working democracy (1992h: 
240-1). Far from being an abstract, bloodless construct as critics have alleged, Habermas sees 
constitutional patriotism as intimately connected to each community’s particular history and culture, 
and to its concerns about identity and the common good—its ethical-political self-understanding. 


The same universalistic content must each time be 
appropriated from out of one’s own specific historical 
life-situation, and become anchored in one’s own 
cultural form of life. Every collective identity, even a 
post-national one, is much more concrete than the 
ensemble of moral, legal and political principles around 
which it crystallizes. (1992h: 241) 


The second criticism is that supposedly constitutional-patriotic identity is, in reality, the thick identity 
of the dominant majority culture, albeit disguised (Laborde 2002: 593-9). Habermas’s argument that 
constitutional principles are interpreted in the light of particular histories plays into this criticism. If 
a particular country’s constitutional patriotism is so closely bound up with its ethical-political self- 
understanding, then the language, culture, beliefs, and values of that country’s majority culture will 
make a deep impression on it. Constitutional patriotism consequently collapses into civic nationalism. 
At the same time, Habermas’s claim that the connection between national identity and constitutional 
democracy is merely contingent holds open the possibility that, even if they have overlapped in the 
past and the present, an ongoing learning process may lead to the two being fully decoupled in the 
future (1998c [2001a: 101-2]). A further question, articulated by Andrea Baumeister among others, is 
whether the liberal-democratic values which are to be interpreted in the light of national histories 
are, in fact, as widely accepted as Habermas believes (Baumeister 2007: 491-5). If not, constitutional 
patriotism may be more particularistic and Eurocentric than it at first appears, raising questions 
about how those who do not adhere to such values are to be integrated into constitutional-patriotic 
polities. 


Habermas’s longstanding support for a European constitution (2001c [2006g]; 1996a essay 6 
[1998a]) can be explained by his hope that a European constitutional patriotism would provide the 
civic solidarity that would allow the transnational polity of the European Union to fulfil its democratic 
potential (2004 chapter 6 [2006c]). 


6.2 Cosmopolitanism and the 
Constitutionalization of International Law 


Habermas’s views on international politics are characterised by a revision of Kant’s cosmopolitanism, 
and a total rejection of Carl Schmitt’s conception of the political as irreducibly antagonistic (1996a 
essay 7 [1998a: 193-201]; 2004 chapter 8 [2006c: 188-93]). After some early explorations (1996a 
essay 7 [1998a]), he rejects a Kantian version of cosmopolitanism, arguing for a “politically 
constituted world society” (2004 chapter 8 [2006c: 161]) rather than a world republic or a global 
federation. His focus is the “constitutionalization of international law” (2004 chapter 8 [2006c: 132- 
5]) without a world government, with the limited aims of ensuring peace and protecting human 
rights, rather than democracy on a global scale. Habermas pays particular attention to the different 
types of legitimation which different levels of global governance would require, and the different 
types of constitution they would need. 


In Perpetual Peace, Kant looks forward to the transformation of international law, with states as its 
subjects, into cosmopolitan law, with global citizens as its subjects. Habermas, in contrast, argues 
that cosmopolitan law must be dualistic, with both states and individuals as its subjects. The 
cosmopolitan community is dualistic in the sense that it is a community of both human beings 
(individual subjects), and states (collective subjects) (2004 chapter 8 [2006c: 135]; 2005b chapter 11 
[2008a: 317]). Both can act as the founding subjects of a world constitution (2009 chapter 7: 119; 
2011 [2012: 58]). 


Habermas describes “a multilevel political system that does not assume a state-like character as a 
whole” (2004 chapter 8 [2006c: 144]), divided into three levels (2005b chapter 11 [2008a: 322-7]), 
each with its own form of legitimation and type of constitution: 


• National: nation states (2009 chapter 7: 115-6). Although they are no longer fully sovereign 
in a globalized world, states still have the monopoly on the use of force within their 
territories. They are based on a defined self-legislating demos (“national” or otherwise), 
which gives itself laws by following democratic procedures, thus linking the rule of law to 
democracy. In other words, states have a more “republican” type of constitution, which 
gives a central role to popular sovereignty established in a revolutionary moment (1998d 
[2001a: 116-8]; 2004 chapter 8 [2006c: 122]). They can have full democratic legitimacy, as 
described in Between Facts and Norms, since within a state the authors of law can also be 
its addressees (2004 chapter 8 [2006c: 141]). States have the highest requirements of 
legitimacy within Habermas’s model. They have two functions: firstly, to provide military 
force for implementing human rights and policy decisions of the higher levels (2005b 
chapter 11 [2008a: 320-321]), and secondly, to generate indirect political legitimacy for the 
higher levels. 

• Transnational: continental blocs like the EU, Habermas’s prime example of a transnational 
polity, along with its less-integrated siblings such as ASEAN and Mercosur (2005b chapter 
11 [2008a: 325-6]). Great powers such as the USA, China, Russia, and India also operate at 
the transnational level (2009 chapter 7: 114), as do economic organizations such as the 
WTO, IMF, and World Bank; and UN agencies such as the WHO and UNESCO (2011 [2012: 
56]). This is the most “pluralist” level in Habermas’s model, since it contains very different 
types of polities, and includes ones which are both liberal and illiberal, democratic and non- 
democratic. At the transnational level there exists a “global domestic politics” (2004 chapter 
8 [2006c: 136, 160]) addressing socioeconomic questions. Issues of wealth and 
redistribution, health and disease, trade, migration, and environmental policy can be 
discussed within and between transnational polities. Habermas still considers relations in 
and between them to count as international relations or foreign policy, although recourse to 
war should be ruled out (2005b chapter 11 [2008a: 325]). The normative yardstick for 
relations between transnational polities is fair negotiation, not the full democratic legitimacy 
which can exist within states (2009 chapter 7: 125-6; 2011 [2012: 57, 68]). They are 
nonetheless open to the influence of deliberative publics from below, and should 
institutionalize some degree of citizen participation, via referenda or mechanisms for 
responding to transnational public spheres. Transnational polities also receive some indirect 
legitimation in virtue of their member states being legitimate (2004 chapter 8 [2006c: 142]). 
They thus have middling requirements of legitimacy, which can be generated in a number of 
direct and indirect ways, and may have several different types of constitutions. 

• Supranational: global organizations with universal membership, comprising both 
individuals and states as members. Habermas envisages a reformed United Nations as the 
central component of the supranational layer, along with a stronger version of the 
International Criminal Court (2004 chapter 8 [2006c: 133-4, 173-4]). Their remit is strictly 
to secure peace and prevent human rights violations (2004 chapter 8 [2006c: 136]; 2005b 
chapter 11 [2008a: 322]; 2011 [2012: 60-1]), with the Security Council performing an 
executive function, the ICC a judicial function, and the UN Charter acting as a supranational 
constitution (2004 chapter 8 [2006c: 160-1]; 2009 chapter 7: 120). A reformed General 
Assembly could function as a world parliament, containing representatives of both states 
and cosmopolitan citizens (2009 chapter 7: 120-1), with its deliberations focused on 
interpreting and elaborating the meaning of the Charter, rather than the kind of political 
will-formation which takes place within national parliaments (2011 [2012: 60-1, 65]). All 
other cross-border political issues should be dealt with at the transnational level—the 
supranational level is the preserve of law, rather than politics (2005b chapter 11 [2008a: 
333-4, 343]; 2011 [2012: 65]). 


This division of labour has drawn criticism, with Cristina Lafont arguing that the relegation of all 
socioeconomic issues to the transnational layer ensures that human rights violations stemming from 
global inequality cannot be addressed (Lafont 2008). Habermas’s suggestion that the “slender but 
robust” cross-cultural consensus on basic human rights is enough to legitimate the supranational 
level’s policies (2004 chapter 8 [2006c: 143]; 2005b chapter 11 [2008a: 343-4]) has also been 
criticized. There is room for doubt on this point, especially with regard to contentious cases such as 
LGBT rights (1998d [2001a: 113-129]; Scheuerman 2008: 143-5). 


Since it is neither democratic nor a state, the supranational level has the lowest requirements of 
legitimacy (2004 chapter 8 [2006c: 133-4, 143]; 2011 [2012: 65]), and is not suited to a republican 
type of constitution, contra Kant. It is suited to a liberal type of constitution, which prioritizes the 
rule of law rather than popular sovereignty (2004 chapter 8 [2006c: 137-9]). A liberal constitution 
constrains established power in accordance with human rights, but does not connect it to the will of a 
self-legislating demos, which is in any case lacking at the global level (2005b chapter 11 [2008a: 
316]). Instead, the supranational level derives its legitimacy directly from the negative duties which it 
enforces (to prevent human rights abuses and wars of aggression), and indirectly from the legitimacy 
of the states which comprise it (2004 chapter 8 [2006c: 140-1, 143]; 2005b chapter 11 [2008a: 342- 
4]). This may be supplemented by the periodic emergence of a global public sphere, mobilised in 
opposition to wars of aggression or in condemnation of gross human rights violations (2004 chapter 8 
[2006c: 142]; 2005b chapter 11 [2008a: 343-4]; 2009 chapter 7: 124-5). 


In this model, supranational law is to have primacy over state law, in much the same way that EU law 
has primacy over the laws of member states (2004 chapter 8 [2006c: 137]). The plausibility of 
Habermas’s proposals depends on two learning processes. Individuals must learn to think and act as 
both national and cosmopolitan citizens, switching between a perspective that centres national 
interest and one that centres universal standards of justice (2009 chapter 7: 116-8), while states 
must learn to regard themselves as members of an international community, not absolute sovereigns 
(2011 [2012: 61]). Habermas regards the constitutionalization of international law as the “legal 
domestication of the intensified cooperation between states” (2014: 8), by which the Hobbesian state 
of nature at the global level can be gradually regulated and subject to law without the need for a 
global government, which could not in any case be fully legitimate. Some degree of global 
governance is unavoidable in the contemporary postnational constellation. What is crucial is that it 
should be constitutional and grounded in universal moral principles, rather than a technocracy at the 
service of neoliberal capitalism. 


7. Religion and Postsecularism 


Habermas’s views about religion and its place in modern society have changed strikingly over the 
course of his career. In a series of texts written mostly after 2001 he revises the secularist bent of his 
earlier social and political theory, as expressed in Theory of Communicative Action and Between 
Facts and Norms, so as to acknowledge religion’s close relation to philosophy and the central place 
of religious believers in democratic states. 


Influenced by Weber and Durkheim, Habermas had earlier characterised religious beliefs as the 
worldviews of traditional societies that in the course of rationalisation are superseded and replaced 
by secular forms. In religious belief the validity claims of objective truth, moral rightness, and 
sincerity, and along with them the objective, intersubjective, and subjective world-relations, were 
fused together. Speakers could not thematize and contest them (1981 [1984a: 214]; 1981 [1987: 
189]). This fusion, maintained by the strict segregation of sacred from profane domains of life, 
enabled a normative consensus to crystallize by non-discursive means (1981 [1987: 54]). That which 
was in accord with society’s ritually protected normative consensus was right, that which violated it 
was wrong, and the consensus itself was beyond questioning. The transition to modernity begins with 
the “linguistification of the sacred”, in which “the authority of the holy is gradually replaced by the 
authority of an achieved consensus” (1981 [1987: 77, see also 288]). Normative consensus and social 
integration are achieved in modern societies through communicative action, carried out by 
competent speaking and acting subjects who have mastered all three validity claims and world- 
relations (1981 [1987: 107, 145]). The sacred, the lynchpin of religious worldviews, has dissolved into 
unrestricted discourse, in which any validity claim may be contested. Religious belief may continue to 
exist, but it is now one worldview among many, and like every other social practice it must be 
continued by means of communicative action (1981 [1987: 88-9]). 


Habermas at this stage saw post-traditional society as secular (1992b [1996b: 443-4]). He 
subsequently rejected this view, stating that 


My earlier Hegelian view of religion as a formation 
destined to be dialectically superseded in the modern 
world has indeed changed. The empirical evidence of 
the survival of religion under modern conditions has 
accumulated in recent decades. (2012 chapter 6 [2017: 
143]) 


His most recent work paints a very different picture of the role of ritual and the sacred (2012 chapter 
3 [2017]), revising many of his central claims in the Theory of Communicative Action. Aside from 
these revisions to his social theory, Habermas’s postsecular writings address the philosophical theme 
of the relationship between religious faith and philosophical reason (§7.1) as well as the political 
theme of religion’s place within deliberative democracy (§7.2). 


7.1 Jerusalem and Athens: Religious Faith 
and Philosophical Reason 


Habermas has, since phase two of discourse ethics in the 1990s, considered religious traditions in 
modern societies to be fruitful sources of ethical values (1983 [1990]; 2001b chapter 1 [2003b]). 
Since postmetaphysical philosophy refrains from proposing concrete visions of the good, modern 
subjects must “appropriate” or “translate” insights from religion (alongside art and literature) to use 
as inputs in their ethical-existential and ethical-political discourses. Maeve Cooke suggests that this 
process of ethical appropriation involves re-presenting the semantic contents of ethical insights, 
shorn of their religious, literary, and artistic contexts, such that their exemplary force in disclosing 
visions of the good remains intact and can be used in secular ethical discourses (Cooke 2011). 


Philosophy, too, has a long history of appropriating or translating concepts from religious traditions 
(2012 chapter 4 [2017: 63-4]). Many apparently secular philosophical ideas have genealogies 
stretching back to religion. Alongside well-known examples such as Schelling and Hegel’s concept of 
the Absolute (2005c: 304), Benjamin’s use of the Messiah, and Adorno’s use of the ban on graven 
images (2005b chapter 8 [2008a: 232]), Habermas discusses several examples from Kant: 
the summum bonum, the ethical community, moral faith, radical evil, and even the moral law itself 
can be seen as secular philosophical translations of the kingdom of God, the church, religious faith, 
original sin, and the Ten Commandments (2001b chapter 2 [2003b: 110]; 2005b chapter 11 [2008a: 
220-3, 224-6]). Habermas lists universal egalitarianism and communicative action (2002 chapter 8: 
149, 160) as concepts from his own work which have religious genealogies. Although some of the 
original concepts’ meaning is inevitably lost in translation (2005c: 309; 2002 chapter 8: 164), such 
“critical assimilation(s) of religious concepts” or “secularizing, but at the same time salvaging, 
deconstruction(s) of religious truths” (2003c: 110) can enrich the vocabulary of postmetaphysical 
philosophy (2005b chapter 5 [2008a: 142]). Despite this, Habermas insists that philosophical 
discourse itself remains secular, retaining its “methodological atheism” (2005c: 304, 309; 2002 
chapter 8: 160)—it appropriates religious concepts, but they do not remain religious concepts. They 
must be contestable in justificatory discourses. 


These conceptual appropriations link the apparently secular philosophical tradition to the major 
world religions, and focusing on them helps to establish a new postsecular self-understanding of 
philosophy. Nonetheless, the process of appropriation could be said to place philosophy in a superior 
position to religion. It seems as if philosophy is able to judge which elements of religion count as 
rational enough to be worth appropriating, and which can be discarded as irrational; religion is 
reduced to a fund of concepts for philosophy’s use (2012 chapter 4 [2017: 63]). Habermas resists this 
interpretation, arguing against a Kantian view in which pure rational faith should eventually replace 
historical faiths (2005b chapter 8 [2008a: 223]), or a Hegelian view in which religion is one moment 
in the dialectic of absolute spirit, soon to be sublated into philosophy (2005b chapter 8 [2008a: 230- 
1]). A truly postsecular self-understanding of philosophy, he claims, must move beyond this. 


Habermas attempts to deflate philosophy’s stance of superiority with regard to religion by arguing 
that philosophical reason and religious faith have a shared origin. Both originated in the Axial Age, 
the intellectual revolution which took place in Greece, Israel, India, and China between 800 and 200 
BCE, as described by Karl Jaspers (Jaspers 1949 [1953]). This transition from mythos to logos saw 
the emergence of universal, context-transcending thinking, and a deepening of human subjectivity 
and ethical thought. The Western philosophical tradition beginning with Socrates, Plato, and Aristotle 
is as much an axial phenomenon as Judaism, Buddhism, and Confucianism (2012 chapter 4 [2017: 
66-9]; Rees 2017: 221-2). As Habermas puts it, 


both modes, faith and knowledge, together with their 
traditions based respectively in Jerusalem and Athens, 
belong to the history of the origins of the secular 
reason which today provides the medium in which the 
sons and daughters of modernity communicate 
concerning their place in the world. (2008c [2010: 17]) 


Religious faith is not, then, the “opaque other of reason” (2005b chapter 5 [2008a: 142]). Rather than 
being hostile strangers, philosophy and religion are in reality estranged siblings and equal partners 
in a fruitful dialogue (2008c [2010: 17-18]). 


7.2 Postsecular Deliberative Democracy 


In writings since the turn of the millennium, Habermas adapts his discourse theory of law and 
democracy so as to take account of the postsecular nature of modern societies. He uses this 
expression 


to describe modern societies which must assume that 
religious groups will continue to exist and that different 
religious traditions will remain relevant, even if the 
societies themselves are largely secularized. (2012 
chapter 4 [2017: 63]) 


His postsecular political theory, then, applies mostly to European and other Western countries with 
an established tradition of secular democracy, which must take account of the presence of religious 
minorities (2009 chapter 5: 59). One alleged problem with earlier models of deliberative democracy, 
such as those outlined by Rawls in Political Liberalism or by Habermas himself in Between Facts and 
Norms, is that they place unfair cognitive burdens on religious citizens of secular states, and in doing 
so threaten to undermine these states’ legitimacy. The central issue is the exclusion of religious 
language from the public sphere, and it can be addressed by modifying the discourse theory of law 
and democracy so as to let religious citizens participate fully. Habermas identifies two problems with 
secular deliberative democracy. It forces religious citizens to “split their identities” between public 
(secular) and private (religious) personae (2005b chapter 5 [2008a: 126-7, 130]), and places 
asymmetrical burdens upon them, compared with their secular fellow citizens. Consider Rawls’s 
“duty of civility”, which requires citizens not to make use of their comprehensive doctrines while 
deliberating in public, and to restrict themselves to political conceptions of justice, which are within 
the bounds of public reason (Rawls 1995 [1996: 217]). Or consider Rawls’s later, more moderate 
view, namely the proviso which allows that citizens can adduce “comprehensive reasons” in public 
political discussion, provided that “in due course” these are replaced by proper political reasons 
(Rawls 1993 [2005: 462, 1]). Following Paul Weithman and Nicholas Wolterstorff (Weithman 2002; 
Audi & Wolterstorff 1997), Habermas objects to Rawls’s proviso for placing morally unacceptable 
burdens on citizens of faith, since 


many religious citizens would not be able to undertake 
such an artificial division within their own minds 
without jeopardizing the pious conduct of their lives. 
(2005b chapter 5 [2008a: 127]) 


He also claims that the burden is unfairly distributed, weighing only on believers (2001b chapter 2 
[2003b: 109]; 2005b chapter 5 [2008a: 136]). Some critics have countered that identity-splitting can 
also affect non-believers (Boettcher 2009; Holst & Molander 2015), while others rejoin that it is a 
reasonable demand to make of modern democratic citizens in pluralist societies, which does not 
threaten their complex identities (Mautner 2014: 24; Finlayson 2019: 11). Habermas, however, 
argues that both objections apply to Rawls’s idea of public reason, governed by the proviso. 


Now, Habermas’s account of legitimate law-making in Between Facts and Norms is wedded to the 
principle of the separation of church and state, and has an unabashedly secular conception of 
politics. The principle of democracy implies that laws are legitimate when citizens’ contributions to 
the informal public sphere filter through into the formal public sphere (the state apparatus, 
parliaments, and legal systems), and influence law-making (1992b [1996b: 371-2, 441-2]). When this 
cycle of feedback is operating correctly, citizens can understand themselves as the “authors and 
addressees” of the law (1992b [1996b: 120]), since they have indirectly participated in the legislative 
process. 


However, religious citizens cannot contribute authentically to a secular public sphere. Their 
contributions to public discourse, if phrased in secular language, will have little meaning for them, 
and thus they will find themselves in the heteronomous position of only being the addressees of the 
law, not its authors (2005b chapter 5 [2008a: 128, 130]). Habermas fears that the secular nature of 
the public sphere prevents religious citizens from fully taking part in the discursive processes which 
legitimate laws. They might comply strategically, seeing the law as a mere fact, and not a norm, but 
for them the laws passed by secular democratic states would lack legitimacy. There is evidently a 
danger here of large numbers of religious believers becoming alienated from democratic law-making, 
with the concomitant danger of political instability and a legitimation crisis (2009 chapter 5: 76). 


The problem is how to construe and design the modern political system in a way that is more 
congenial to religious citizens, without abandoning the secular political state, and how to show that 
the Weithman/Wolterstorff objections, which Habermas agrees apply to Rawls’s proviso, do not apply to 
his own theory. His proposed solution, a model of postsecular deliberative democracy, has two 
elements corresponding respectively to the formal and informal public spheres, that is, to the 
political system and civil society. 


First, echoing Rawls, Habermas proposes an “institutional translation proviso” at the threshold 
between the informal and formal public spheres. In the formal public sphere, public officials, 
politicians, and judges must restrict themselves to secular language, but in the informal public 
sphere, ordinary citizens are free to contribute to public discourse in religious language (2009 
chapter 5: 76; 2005b chapter 5 [2008a: 130-2]). Before these religious statements pass through into 
the formal public sphere and impact the legislative process, they are to be translated into secular 
terms, a cooperative task in which both religious and non-religious citizens take part (2005b chapter 
5 [2008a: 112-3]; 2012 chapter 7 [2017: 172]). The institutional translation proviso thus acts as a 
filter, maintaining the secular nature of the state, but allowing religious citizens free rein to air their 
religious reasons in informal public discourse (2005b chapter 5 [2008a: 131]). 


Second, religious and non-religious citizens in civil society must undergo complementary learning 
processes, leading to them becoming reflexive about their beliefs (2005b chapter 5 [2008a: 111-2]). 
For religious believers, this involves coming to terms with three aspects of modern society: 
reasonable pluralism, the priority of scientific knowledge, and the secular nature of the state (2001b 
chapter 2 [2003b: 104]; 2005b chapter 5 [2008a: 137]; 2012 chapter 7 [2017: 173]). For 
non-believers, it involves accepting the continuing presence of religion in modern society, coming to view 
their disagreements with believers as reasonable, accepting that believers have a right to contribute 
to the public sphere in religious language, being willing to help translate those contributions into 
secular language, and finally forgoing scientism, and scientistically motivated atheism, in favour of 
political agnosticism (2005b chapter 4 [2008a: 113]; 2005b chapter 8 [2008a: 263-4]; 2005b chapter 
10 [2008a: 309-10]; see also Baxter 2011: 205). 


Taken together, Habermas claims, these learning processes help to equalize the cognitive burdens 
borne by religious and non-religious citizens. This model of postsecular deliberative democracy, 
Habermas argues, solves the problems of identity-splitting and de-legitimation which secularism 
inflicts on religious believers. Since they can now contribute to the informal public sphere in religious 
language, believers no longer have to split their identities in two; since they know that their 
religiously based contributions are making an impact on the formal public sphere, they can see the 
laws which it produces as legitimate. 


Even if the religious language is the only one which 
they speak in public, and if religiously justified opinions 
are the only ones they can or wish to contribute to 
political controversies, they nevertheless understand 
themselves as members of a civitas terrena, which 
empowers them to be the authors of laws to which they 
are subject as addressees. (2005b chapter 5 [2008a: 
130-1]) 


Habermas’s theory has come in for criticism from all sides. On the one hand, critics like Amy Allen 
claim that there is a residual asymmetry, and that Habermas still “stacks the decks in favour of 
secularism” (Allen 2013: 149). Wolterstorff agrees, though he thinks the problem lies with the idea of 
“post-metaphysical reason”, while Allen traces it to his “genealogy of post-secular reason”. Cristina 
Lafont argues, by contrast, that the asymmetric burden falls the other way, since the cognitive 
learning process imposes a duty on secular citizens to give up their scientistically motivated atheism 
in favour of politically motivated agnosticism (Lafont 2013: 238; Finlayson 2019: 14). 


The idea of “sacred-to-secular translation” is central to Habermas’s model of postsecular deliberative democracy. 
As an example, Habermas cites German Christian groups translating the statement from Genesis that 
“God created man in his own image” into secular language as “a gamete fertilized ex utero has the 
status of a subject of human rights” as part of their arguments against stem-cell research (2001b 
chapter 2 [2003b: 109]). Yet the idea has proved controversial. It is not clear, for example, whether 
or not Habermas considers it to work the same way as ethical and philosophical appropriations of 
religious concepts (see Cooke 2011; Kerkwijk 2015; Rees 2018: 143-65). For Wolterstorff it relies on 
the idea of post-metaphysical reason that is skewed in favour of secularism, while Rees argues that 
the idea is empty, since no such translations are possible. 


